(54) Novel LDL receptor analog protein and the gene coding therefor

(57) The present invention is drawn to the gene of a 
novel LDL receptor family receptor which participates in 
lipoprotein metabolism, a critical factor that triggers the 
onset of arteriosclerosis. 

The invention provides DNA having a nucleotide
sequence as shown by Sequence ID No. 1 or No. 5,
as well as rabbit tissue or human tissue LDL
receptor analog protein having an amino acid sequence
of Sequence ID No. 2 or 6 coded by such DNA.

Description

Background of the Invention : 

1) Field of the invention 

The present invention relates to a novel LDL receptor analog protein having a structure similar to that of LDL
receptors, which are responsible for the homeostasis mechanism of intracellular cholesterol and extensively participate in
serum lipid metabolism, which is a critical factor that triggers the onset of arteriosclerosis. The invention also relates to
the gene coding for the protein.

2) Description of the Related Art 

Abnormality in serum lipid metabolism is one of the most critical risk factors in the onset and progress of
arteriosclerosis. Serum lipids, together with apolipoproteins, are transformed into lipoproteins primarily in the liver, secreted
therefrom, transported by blood, and taken up by a variety of tissue cells.

Uptake of lipoproteins into cells occurs primarily by the mediation of receptors of respective lipoproteins. It is known
that low density lipoproteins (LDL), which are taken into cells by specific membrane receptors, called LDL receptors,
are metabolized within the cells and utilized as cell membrane components or similar substances. Detailed analysis of
familial hypercholesterolemia, which is a genetic disease accompanied by notable hypercholesterolemia due to
abnormality of LDL receptors, has clarified details of the mechanism of homeostasis achieved by LDL receptors with respect
to intracellular cholesterol.

It has been suggested that living bodies have not only LDL receptors but also cell membrane receptors that
recognize other lipoproteins. From analyses of WHHL rabbits, which are model animals lacking LDL receptors, it was found
that receptors which take principally apo-E-containing lipoproteins as ligands (remnant receptors) are present in the
liver. It is also predicted that there may be HDL receptors whose ligands are high density lipoproteins (HDL). However,
to date, details of the structures and functions of these receptors have not yet been elucidated. It has also been known
that foaming of macrophages is deeply involved in the formation of atherosclerosis. Macrophages
foam by taking up modified LDL (not normal LDL) which has undergone oxidation, acetylation, or glycation.
There have recently been discovered receptors for modified LDL, which are called scavenger receptors. The scavenger
receptors have been identified to be membrane receptors that have a structure completely different from that of LDL
receptors.

Recent research using molecular biological techniques has identified the genes of LRP (LDL receptor-associated
protein), gp330, and VLDL receptors. These receptors have been found to have structures very similar to those of LDL
receptors. From analyses of these receptors, it is believed that a plurality of lipoprotein receptors are present in living
bodies, and that they are closely related to lipid metabolism. LDL receptors, studied in detail by Brown and Goldstein
[Brown, M.S. and Goldstein, J.L. (1986) Science 232, 34-47], are known to play an important role in the homeostasis of
lipoprotein metabolism in vivo, recognizing apo-B-100 and apo-E and taking primarily LDL as their ligands. Also, LRP,
which is a macroprotein, has been found to primarily recognize apo-E and to take β-VLDL or chylomicron remnant as
a ligand. Moreover, it has been recently reported that LRP takes an α2-macroglobulin/protease complex or a
plasminogen activator/plasminogen activator inhibitor-1 complex as a ligand and that LRP is a protein identical to the
α2-macroglobulin receptor. When these findings are taken together, LRP is considered to have a wide variety of functions in
living bodies [Herz, J., Hamann, U., Rogne, S., Myklebost, O., Gausepohl, H. and Stanley, K.K. (1989) EMBO J. 7(13),
4119-4127; Brown, M.S., Herz, J., Kowal, R.C. and Goldstein, J.L. (1991) Current Opinion in Lipidology 2, 65-72; Herz,
J. (1993) Current Opinion in Lipidology 4, 107-113]. The gp330, which was first identified as an antigen inducing rat
Heymann nephritis, has been reported to have a ligand-binding capacity similar to that possessed by the
LRP/α2-macroglobulin receptor [Raychowdhury, R., Niles, J.L., McCluskey, R.T. and Smith, J.A. (1989) Science 244, 1163-1165;
Pietromonaco, S., Kerjaschki, D., Binder, S., Ullrich, R. and Farquhar, G. (1990) Proc. Natl. Acad. Sci. U.S.A. 87,
1811-1815]. In addition, recently discovered VLDL receptors, which are found to take VLDL as a ligand, are considered to
have new functions including fatty acid metabolism, because they are predominantly found in tissues of the heart and
muscles though they are rarely found in the liver [Takahashi, S., Kawarabayashi, Y., Nakai, T., Sakai, J. and Yamamoto,
T. (1992) Proc. Natl. Acad. Sci. USA 89, 9252-9256].

Functions of these newly found receptors as lipoprotein receptors have been gradually elucidated through detailed
in vitro analyses. However, the significance of the respective receptors in living bodies has mostly been left unknown. In
addition, relations to remnant receptors, HDL receptors, etc., which have conventionally been identified or suggested by
biochemical techniques, remain unknown. Presently, it is considered that these newly found receptors are products of
genes different from those of the latter receptors. Thus, more lipoprotein receptors than originally supposed have
come to be considered to participate in lipoprotein uptake into cells while interacting with each other to thereby function to
maintain homeostasis of lipid metabolism in living bodies. However, from structural analyses of the genes of the
aforementioned newly-identified receptors, it is predicted that the genes of these receptors that take lipoproteins as ligands
are developed from the same gene from which LDL receptors were developed, and thus they are within the same genetic
family. This suggests that lipoprotein receptors that have conventionally been proposed may have structures similar to
those of LDL receptors.

Accordingly, an object of the present invention is to provide the gene of a novel receptor in the LDL receptor family,
as well as a protein coded by the gene.

The present inventors conducted careful studies so as to attain the above object, and found that by using part of
rabbit LDL receptor cDNA as a probe there can be obtained a DNA fragment coding for a peptide having a structure
similar to that of LDL receptors. Moreover, when using part of the obtained cDNA as a probe, a cDNA fragment having
a sequence similar to that of the cDNA can be obtained from the human tissue cDNA library. The present invention was
accomplished based on these findings.

Summary of the Invention 

The present invention provides DNA having a nucleotide sequence shown by Sequence ID No. 1 or No. 5; an LDL
receptor analog protein having an amino acid sequence coded by the DNA; a recombinant vector comprising the DNA
and a replicable vector; transformant cells which harbor the recombinant vector; and a method for the production of the
LDL receptor analog protein.



Description of Preferred Embodiment

The cDNA of the present invention may be prepared, for example, by the following process.
Briefly, the process includes the following steps. (1) Through the use of rabbit LDL receptor cDNA as a probe,
positive clones are screened out of a rabbit liver cDNA library. (2) Recombinant DNA is prepared using the separated
positive clones, and a cDNA fragment is cut out of the resultant recombinant DNA through a treatment using a restriction
enzyme. The cDNA fragment is integrated into a plasmid vector. (3) Host cells are transformed using the obtained
cDNA recombinant vector to thereby obtain transformant cells of the present invention. The obtained transformant cells
are incubated so as to obtain a recombinant vector containing a DNA fragment of the present invention. The nucleotide
sequence of the DNA fragment of the present invention contained in the resultant recombinant vector is determined. (4)
In tissue of a living body, expression of mRNA indicated by the nucleotide sequence of the cDNA of
the present invention is detected by using an RNA blot hybridization method. (5) Through use of a rabbit cDNA fragment
as a probe, positive clones are screened out of a human tissue cDNA library, and the nucleotide sequence of the clones is
determined. (6) A recombinant vector for expression is prepared using the cDNA of the present invention. Through use of the
thus-obtained vector, host cells are transformed to thereby obtain the transformants of the present invention. (7)
Ligands that are bound to protein expressed by the obtained transformants are detected by ligand blotting.
Each of the above-described steps will next be described.



(1) Screening for positive clones from a rabbit liver cDNA library:



A cDNA library may be prepared by the use of mRNA obtained from rabbit liver, reverse transcriptase, and a
suitable vector, e.g., commercially available λgt10 vector.

A cDNA library thus prepared using λgt10 as a vector is subjected to a screening for positive clones by the
application of a DNA hybridization method employing a cDNA probe, to thereby separate positive clones [Sambrook, J.,
Fritsch, E.F. and Maniatis, T. (1989) In: Molecular Cloning: A Laboratory Manual, pp. 9.47-9.58, Cold Spring Harbor
Laboratory Press].

An exemplary cDNA which may be used as a probe is rabbit LDL receptor cDNA. Positive clones may be detected
by autoradiography employing a DNA probe labelled with a radioisotope (³²P).

(2) Preparation of a cDNA recombinant vector:

Recombinant vector λgt10 phage DNA is extracted from the isolated positive clones and purified. The resultant
purified recombinant vector λgt10 phage DNA is digested with a restriction enzyme, EcoRI, to thereby separate a cDNA
fragment from the vector DNA. The obtained cDNA fragment is integrated with a plasmid vector for cloning that has
been similarly digested with EcoRI, thereby obtaining a recombinant plasmid vector. An exemplary plasmid vector
which may be used is pBluescript II.

(3) Recombinant vector, transformation of host cells using the recombinant vector, and preparation of DNA: 

The obtained cDNA recombinant vector is introduced into a variety of host cells that are capable of utilizing the
genetic marker possessed by the recombinant vector, to thereby transform the host cells. Host cells are not particularly
limited, with E. coli being preferred. For example, a variety of variants of the E. coli K12 strain, e.g., HB-101, may be
used. In order to introduce the recombinant vector into host cells, a competent cell method may be used in combination
with a treatment with calcium.

The thus-obtained transformant cells are cultured in a selective medium in accordance with the genetic marker of
the vector. The recombinant vector of the present invention is collected from the cultured cells. The DNA nucleotide
sequence of the cDNA contained in the obtained recombinant vector can be determined through use of a dideoxy
sequence method [Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467].

(4) RNA blot hybridization:

The expression in tissue of mRNA, indicated by the nucleotide sequence of the cDNA of the present invention, is
detected using RNA blot hybridization.

First, mRNA is prepared using rabbit tissue. A commercially available oligo(dT) cellulose column may be used for the
preparation. In order to prepare mRNA from human tissue, there may be used a commercially available nylon
membrane on which tissue poly(A)+ RNA from a variety of sources is present.

An exemplary probe is the rabbit cDNA obtained in the above-described step (3). mRNA may be detected by
autoradiography employing a DNA probe labelled with a radioisotope (³²P).

(5) Screening of human tissue cDNA library for positive clones, and determination of nucleotide sequence: 

An exemplary human tissue cDNA library which may be used is a commercially available human brain cDNA 
library. 

Screening and nucleotide sequencing of the human brain cDNA library may be performed using a fragment of
rabbit cDNA of the present invention as a probe in a manner similar to that used for the aforementioned rabbit liver cDNA
library.

(6) Preparation of a recombinant vector for expression and transformation of host cells using the recombinant vector for
expression:

In order to prepare an LDL receptor analog protein through use of cDNA of the present invention, the obtained
cDNA and a vector for expression are first bonded to each other to thereby create a recombinant vector for expression.
Vectors for expression which may be used for bonding are not particularly limited. For example, pBK-CMV may be used.

Host cells are transformed using the thus-obtained recombinant vector for expression, to thereby obtain a
transformant cell of the present invention. The obtained transformant cell is cultured so as to obtain cells that are capable of
expressing the protein of the invention. Host cells are not particularly limited. For example, CHO cells may be used. In
order to introduce the recombinant vector for expression into host cells, a calcium phosphate method may be used.

The thus-prepared transformant cells are incubated in a selective medium in accordance with the genetic marker
of the vector, so as to express the LDL receptor analog protein of the present invention.

(7) Ligand analysis of the protein by ligand blotting:

After the resultant transformant cells are incubated, the expressed LDL receptor analog protein is solubilized using
a solubilizer, e.g., Triton X-100, to thereby obtain a membrane protein fraction. The fraction is separated using
SDS-PAGE, and transferred onto, for example, a nitrocellulose membrane. Using a radio-labelled (¹²⁵I) lipoprotein as a
probe, the analog protein can be detected by autoradiography. Exemplary lipoproteins which may be used include
β-VLDL and LDL.

Examples: 

The present invention will next be described in detail by way of example, which should not be construed as limiting
the invention.

Example 1:

Preparation of a rabbit liver cDNA library: 

From tissue of the liver of a male Japanese white rabbit, intact RNA was extracted through a guanidinium
thiocyanate/cesium chloride method. The obtained intact RNA was subjected to an oligo (dT) cellulose column method to
thereby obtain purified poly(A)+ RNA.

cDNA was synthesized in accordance with a method of Gubler and Hoffman [Gubler, U. and Hoffman, B.J. (1983)
Gene 25, 263]. Briefly, cDNA was synthesized employing rabbit liver poly(A)+ RNA (as a template), a random primer,
and Moloney murine leukemia virus reverse transcriptase. The synthesized cDNA was transformed into
double-stranded DNA using DNA polymerase I, and then subjected to an EcoRI methylase treatment. By the use of T4 DNA
polymerase, the DNA was blunt-ended. The blunt-ended DNA was ligated to phosphorylated EcoRI linker
pd(CCGAATTCGG) using a T4 DNA ligase, and the resultant ligated product was subjected to an additional digestion with
EcoRI. cDNA fragments having a size not less than 1 kb were selected by agarose gel electrophoresis, and integrated
into the EcoRI-digested site of λgt10 phage DNA using a T4 DNA ligase. The phage DNA was packaged in vitro, to
thereby establish a rabbit liver cDNA library.
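
The linker step above relies on the decamer pd(CCGAATTCGG) carrying an EcoRI recognition site. As a minimal sketch (the function and constant names are illustrative and are not part of the specification), the following checks that the linker contains GAATTC and that it is self-complementary, which is what allows a single oligonucleotide to form a double-stranded linker:

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
LINKER = "CCGAATTCGG"   # linker sequence quoted in Example 1


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def describe_linker(linker: str, site: str = ECORI_SITE) -> dict:
    """Report where the site starts (0-based) and whether the linker is palindromic."""
    return {
        "site_offset": linker.find(site),
        "self_complementary": reverse_complement(linker) == linker,
    }


if __name__ == "__main__":
    print(describe_linker(LINKER))
```

Because the linker reads the same on both strands, digestion with EcoRI after ligation leaves EcoRI-compatible cohesive ends on each cDNA fragment.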



Example 2: 

Cloning of cDNA of receptors in the rabbit LDL receptor family: 

The cDNA library (1,000,000 plaques) prepared in Example 1 was subjected to screening using a plaque
hybridization method and employing as a probe a segment of the cDNA obtained from a ligand binding region, the functional
region, of the rabbit LDL receptor. Hybridization was performed at 42°C using 5 x SSC, 30% formamide, 1% SDS, 5 x
Denhardt's, and 100 ng/ml salmon sperm DNA (ssDNA), followed by washing with 0.3 x SSC/0.1% SDS at 48°C. As a
result, several positive clones were obtained. These cDNA clones were separated by performing this plaque
hybridization method a plurality of times. Subsequently, a cDNA fragment of each phage was subcloned into a plasmid vector
pBluescript II, and the nucleotide sequence was analyzed using a dideoxy sequence method [Sanger, F., Nicklen, S.
and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467]. Based on a putative amino acid sequence, LDL
receptors themselves were excluded, and cDNA clones having a sequence very similar to that of LDL receptors were
identified. Using these clones as cDNA probes, the cDNA library was screened to thereby obtain two overlapping
clones. These were employed as new probes and a similar procedure was performed, so as to obtain 5 cDNA clones. The
DNA nucleotide sequence determined from these cDNA clones is shown as Sequence ID No. 3. The total length of the
sequence was 6961 bp. In the open reading frame of 6639 bp (Sequence ID No. 1), which contained a sequence
exhibiting high homology with LDL receptors, there existed on the 5' side an ATG codon which was presumably a translation
initiating site and a successive highly hydrophobic sequence consisting of about 30 amino acids. Accordingly, the
obtained cDNA was considered to contain the entirety of its length. A putative amino acid sequence is shown as
Sequence ID No. 2. The protein consisted of 2213 amino acids. When the amino acid sequence of the protein was compared
with other amino acid sequence data registered at GenBank, there was a very high similarity to LDL receptors.
That is, amino acids 700 - 1,100 in the sequence were very similar to the EGF precursor homology region of LDL
receptors, and amino acids 1,100 - 1,640 were also very similar to the ligand binding region of LDL receptors. When the
amino acid sequence of the subject protein was compared with other lipoprotein receptors (LRP, gp330, and VLDL
receptors), similarity was not as high as that observed for LDL receptors. On the C-terminal side of the amino acid sequence
of the protein, there was found a highly hydrophobic region which was very similar to the transmembrane region of LDL
receptors.
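
The relation between the 6639 bp open reading frame and the 2213-residue protein can be checked with simple codon arithmetic. The sketch below is illustrative only: it assumes the standard genetic code, takes the first 30 nucleotides of Sequence ID No. 1 as printed in the listing, and uses hypothetical function names. The stated figures (6639 / 3 = 2213) imply that the quoted reading frame length does not count a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def residues_in_orf(orf_bp: int) -> int:
    """Number of codons in a reading frame whose length is a multiple of three."""
    if orf_bp % 3 != 0:
        raise ValueError("reading frame length must be a multiple of 3")
    return orf_bp // 3


def translate(dna: str) -> str:
    """Translate an uppercase DNA string codon by codon, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(residues_in_orf(6639))                        # 2213
    print(translate("ATGGCGACACGGAGCAGCAGGAGGGAGTCG"))  # MATRSSRRES
```

The translation starts with methionine from the ATG codon that the text identifies as the presumed initiation site.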



Example 3: 

From liver tissue and brain tissue of a male Japanese white rabbit, intact RNA was extracted through a guanidinium
thiocyanate/cesium chloride method. The obtained intact RNA was subjected to an oligo (dT) cellulose column method
to thereby obtain purified poly(A)+ RNA. The poly(A)+ RNA specimens (10 µg each) were modified via a glyoxal method,
electrophoresed on 1% agarose gel, and transferred onto a nylon membrane.

For human tissue mRNA, commercially available nylon membranes blotted with human tissue poly(A)+ RNA from
various sources were used.

Using as a probe part of a ³²P-labelled rabbit cDNA of the present invention, hybridization was performed at 42°C
using 50% (rabbit) or 40% (human) formamide, 0.1% SDS, 50 mM phosphate buffer, 5 x Denhardt's, 5 x SSC, and 200
µg/ml of ssDNA, followed by washing with 0.1 x SSC and 0.1% SDS at 50°C. Autoradiography was performed at -70°C
for 2 days in the presence of an intensifying screen. As a result, in both rabbit liver tissue and brain tissue, mRNA of about
7 kb was detected as well as mRNA of about 1 5 kb which was considered to result from alternative splicing or
polyadenylation. The size of the mRNA of about 7 kb coincided with that of the rabbit cDNA of the present invention. Also, in
human liver tissue and brain tissue, it was confirmed that mRNA having the same size was expressed.
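
The SSC strengths quoted for hybridization and washing can be converted into salt concentrations. The sketch below assumes the conventional stock definition of 20 x SSC as 3.0 M NaCl and 0.3 M sodium citrate, which the text itself does not state; the function name is illustrative.

```python
# Assumed stock definition (not given in the text): 20 x SSC = 3.0 M NaCl, 0.3 M sodium citrate.
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(fold: float) -> dict:
    """Return NaCl and sodium citrate molarity for an SSC solution of the given strength."""
    scale = fold / STOCK_FOLD
    return {
        "NaCl_M": round(STOCK_NACL_M * scale, 4),
        "citrate_M": round(STOCK_CITRATE_M * scale, 4),
    }


if __name__ == "__main__":
    for label, fold in [("hybridization (5 x SSC)", 5.0), ("wash (0.1 x SSC)", 0.1)]:
        print(label, ssc_molarity(fold))
```

Under this assumption the 0.1 x SSC wash is a fifty-fold reduction in salt relative to the 5 x SSC hybridization buffer, which is what makes the wash the more stringent step.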

Example 4:

Screening of human brain cDNA library for positive clones and determination of the nucleotide sequence of cDNA
fragments

The human brain cDNA library used in this Example was a commercially obtained cDNA library which was
constructed using λgt10 as a vector. Using partial cDNA of the present invention as a probe, screening of the cDNA library
(300,000 plaques) was performed using a plaque hybridization method. Procedures of screening, cloning, and
sequencing were as described in Example 2 of the present invention.

As a result of screening of the human brain cDNA library, positive clones containing a DNA fragment of about 3 kb
were obtained. Analysis of the nucleotide sequence of part of the cDNA fragment revealed that the fragment was highly
homologous to the cDNA of the present invention (Sequence ID No. 4).

Example 5: 

Cloning of cDNA of receptors in the human LDL receptor family:

A human brain cDNA library was subjected to screening using fragments of the cDNA of the present invention and 
fragments of the cDNA obtained in Example 4 as probes. Procedures of screening, cloning, and sequencing were as 
described in Example 2 of the present invention. 

Through screening of the human brain cDNA library, two positive clones containing cDNA fragments of about 6 kb
and about 3 kb were obtained. When their nucleotide sequence was analyzed, they were identified to be a cDNA clone
containing the cDNA nucleotide sequence obtained in Example 4 and a cDNA clone that overlapped therewith. Using
part of these cDNAs as probes, procedures similar to those described above were performed, to thereby obtain
another cDNA clone. The DNA nucleotide sequence indicated by these cDNA clones is shown as Sequence ID No.
7. The total length of the sequence was 6,843 bp. There was an open reading frame having a size of 6,642 bp
(Sequence ID No. 5). A putative amino acid sequence is shown as Sequence ID No. 6. The protein consisted of 2,214
amino acids. Comparison of the amino acid sequence with that of rabbit protein shown by Sequence ID No. 2 revealed
high homology of not less than 94%.
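
A figure such as "not less than 94%" is typically obtained by counting matching positions in an alignment of the two sequences. The text does not say how the homology was computed, so the following is only a minimal sketch under the assumption that simple percent identity over pre-aligned, equal-length sequences is meant; the two peptides in the usage example are hypothetical toy strings, not taken from Sequence ID No. 2 or No. 6.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue (gap columns never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical toy peptides differing at one of ten positions.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))  # 90.0
```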

Example 6:

Creation of cells that express receptors in the rabbit LDL receptor family:

The cDNA as shown by Sequence ID No. 3 was ligated to phosphorylated EcoRI linker pd(CCGAATTCGG) by the
use of a T4 DNA ligase, and the resultant ligated product was digested with EcoRI. Separately, a vector for expression,
pBK-CMV, was digested with EcoRI. The aforementioned DNA was ligated to the EcoRI-digested site of the vector using
a T4 DNA ligase.

Using the resultant recombinant expression vector in a calcium phosphate method [Chen, C. and Okayama, H.
(1987) Mol. Cell. Biol. 7, 2745-2752], host cells (CHO-ldlA7) were transformed. The resultant transformants were
incubated in a Ham's F-12 selective medium supplemented with 500 µg/ml of G418, and viable cells were separated as LDL
receptor analog protein-expressing cells. The cells were incubated further in the aforementioned medium.

Example 7:

Ligand analysis of the LDL receptor analog protein by ligand blotting: 

The obtained LDL receptor analog protein-expressing cells and control cells were suspended in a buffer solution
containing 200 mM Tris-maleic acid (pH 6.5), 2 mM calcium chloride, 0.5 mM PMSF, 2.5 µM leupeptin, and 1% Triton
X-100, to thereby solubilize the membrane protein. Solubilized membrane protein fractions were obtained through
centrifugation, and electrophoresed by a 4.5-18% gradient SDS-PAGE. Thereafter, the protein was transferred onto a
nitrocellulose membrane.

Incubation was performed in a buffer of 50 mM Tris-HCl (pH 8.0) containing ¹²⁵I-labelled β-VLDL (10 µg/ml), 2 mM
calcium chloride, and 5% bovine serum albumin. Autoradiography was performed at room temperature.

A single band of about 250 kDa was detected in membrane protein fractions prepared using the present
protein-expressing cells. This size coincided well with the molecular weight of 248 kDa calculated regarding the amino acid
sequence (Sequence ID No. 2) deduced from the cDNA of the present invention. Although a similar band was detected
for control cells, the expression level was much lower as compared with the case of the present protein-expressing cells.
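
The agreement between the band of about 250 kDa and the calculated 248 kDa can be sanity-checked from the residue count alone. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue, which is an assumption introduced here and not the method used in the text; the calculated 248 kDa presumably reflects the actual amino acid composition. Glycosylation is ignored.

```python
# Rule-of-thumb average residue mass (an assumption, not taken from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int, avg_residue_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough polypeptide mass in kDa from the number of residues."""
    return n_residues * avg_residue_da / 1000.0


if __name__ == "__main__":
    print(round(approx_mass_kda(2213), 2))  # about 243 kDa for the 2213-residue rabbit protein
```

The estimate of roughly 243 kDa is within a few percent of the 248 kDa stated above, so a band near 250 kDa is the expected position for the full-length product.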

Since the protein coded by the cDNA of the present invention is considered to be a novel LDL receptor family
receptor, it is expected that through analyses of this protein, details of lipoprotein metabolism mediated by the membrane
receptor will be elucidated, and pathology of abnormal lipid metabolism which triggers onset and progress of
arteriosclerosis will be clarified.

EP 0 773 290 A2

Sequence ID No. 1 

Length of the Sequence: 6639 

Type: nucleic acid 

Strandedness : double 

Topology: linear 

Molecular type: cDNA to mRNA 

Sequence : 

ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCCT TCCTATTCAC CCTGGTCGCG 60 

CTGCTGCCGC CCGGGGCTCT CTGCGAGGTG TGGACGCGGA CACTGCACGG CGGCCGCGCG 120

CCCTTACCCC AGGAGCGGGG CTTCCGCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180

TGGGAGCGCG GGGATGCCAG GGGGGCGAGC CGGGCGGACG AGAAGCCGCT CCGGAGGAGA 240

CGGAGCGCTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTCAG CCTCAATGAT 300

TCCCACAATC AGATGGTGGT GCACTGGGCC GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360 

GCCCGGGACA GCCTGGCGTT GGCCAGGCCC AGGAGCAG7G ATGTGTACGT GTCTTATGAC 420 

TATGGAAAAT CATTCAATAA GATTTCAGAG AAATTGAACT TCGGCGCGGG AAATAACACA 480 

GAGGCTGTGG TGGCCCAGTT CTACCACAGC CCTGCGGACA ACAAACGGTA CATCTTCGCA 540 

GATGCCTACG CCCAGTATCT CTGGATCACG TTTGACTTCT GCAACACCAT CCATGGCTTT 600 

TCCATCCCGT TCCGGGCAGC TGATCTCCTA CTCCACAG m A AGGCCTCCAA CCTTCTCCTG 660 

GCCTTCCACA GGTCTCACCC CAACAAGCAG CTGTGGAAGT CGGATGATTT TGGCCAGACC 720 

TGGATCATGA TTCAAGAACA CGTGAAGTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780

CCAAACACCA TCTACATCGA ACGGCACGAA CCTTCTGGCT ACTCCACGGT TT7CCGAAGT 840 

ACAGACTTCT TGCAGTCCCG GGAAAACCAG GAAGTGATCT TGGAGGAAGT GAGAGACTTT 900 

CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC ATCTCTTGGG CAGTCCACTG 960 

CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGCCGGC CGCCCAGTTT 1020

GTTACAAGAC ATCCTATCAA CGAATATTAC ATCGCGGATG CCTCGGAGGA CCAGGTGTTT 1080

GTGTGTGTCA GTCACAGCAA CAACCGCACC AACCTCTACA TCTCGGAGGC AGAGGGCTTG 1140 

AAGTTCTCTC TGTCCCTGGA GAACGTGCTC TACTACACCC CGGGAGGGGC CGGCAGTGAC 1200 

ACCTTGGTGA GGTACTTTGC AAATGAACCG TTTGCTGACT TCCATCGTGT GGAAGGGTTG 1260

CACGCAGTCT ACATTGC7AC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCT 1320

GTCATCACfT TTGACAAACG GGGCACCTGG GAATTTCTSC AGGCTCCAGC CTTCACGGGG 1380

TATGGAGAGA AAATCAACTG TGAGCTGTCC GAGGGCTGTT CCCTCCACCT GGCCCAGCGC 1440

CTCAGCCAGC TGCTCAACCT CCAGCTCCGG AGGATCCCCA TCCTGTCCAA GGAGTCGGCG 1500

CCTGGCCTCA TCATTGCCAC GGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560

TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAT 1620

ACATGGGGAG ACCATGGCGG CATCATCATG GCCATTGCCC AAGGCATGGA AACCAACGAA 1680

CTGAAGTACA GTACCAACGA AGGGGAGACC TGGAAAGCCT TCACCTTCTC TGAGAAGCCC 1740

GTGTTTGTGT ATGGGCTCCT CACGGAACCC GGCGAGAAGA GCACGGTCTT CACCATCTTT 1800

GGCTCCAACA AGGAGAACGT GCACAGCTGG CTCATCCTCC AGGTCAATGC CACAGACGCC 1860

CTGGGGGTTC CTTGCACAGA GAACGACTAC AAGCTCTGGT CACCATCTGA TGAGCGGGGG 1920

AATGAGTGT7 TGCTTGGACA CAAGACTGTT TTCAAACGGA GGACCCCGCA CGCCACATGC 1980

TTAACGGAG AAGACTHGA CAGGCCGGTG GTTGTGTCGA ACTGCTCCTG CA2CCGGGAG 2040

GACTATGAGT GTGACTTGG CTTCCGGATG AGTGAAGACT TCGCATTAGA GGTGTGTGTT 2100

CCAGATCCAG GATTTTCTGG AAAGTCCTCC CCTCCAGTGC CTTGTCCCGT GGGCTCTACG 2160

"ACAGGCGAT CAAGAGGCTA CCGGAAGATT TCTGGGGACA CCTGTAGTGG AGGAGATGTT 2220

GAGGCACGGC TAGAAGGAGA GCTGGTCCCC TGTCCCCTGG CAGAAGAGAA CGAGTTCATC 2280

CTGTACGCCA CGCGCAAGTC CATCCACCGC TATGACCTGG CTTCCGGAAC CACGGAGCAG 2340

-TCCCCCTCA CTGGGTTGCG GGCAGCAGTG GCCCTGGACT TTGACTATGA GCACAACTGC 2400

CTGTATTGG" CTGACCTGGC CTTGGACGTC ATCCAGCGCC TCTGTTTGAA CGGGAGTACA 2460

GGACAAGAGG TGATCATCAA CTCTGACCTG GAGACGGTAG AAGCTTTGGC TTTTGAACCC 2520

CTCAGCCAAT TACTTTACTG GGTGGACGCA GGCTTTAAAA AGATCGAGGT AGCCAATCCA 2580

GATGGTGAC~ TCCGACTCAC CGTCGTCAAT TCCTCGGTGC TGGATCGGCC CCGGGCCCTG 2640

GTCCTTGTGC CCCAAGAAGG GATCATGTTC TGGACCGACT GGGGAGACCT GAAGCCTGGG 2700

ATTTATCGGA GCAACATGGA CGGATCTGCC GCCTATCGCC TCGTGTCGGA GGATGTGAAG 2760

m GGCCCAATG GCATTTCCGT GGACGATCAG TGGATCTAGT GGACGGATGC CTACCTGGAC 2820

TGCATTGAGC GCATCACGTT CAGCGGCCAG CAGCGCTCCG TCATCCTGGA CAGACTCCCG 2880

CACCCCTATG CCATTGCTGT CTTTAAGAAT GAGATTTACT GGGATGACTG GTCACACCTC 2940

AGCATATTCC GAGCTTCTAA GTACAGCGGG TCCCAGATGG AGATTCTGCC CAGCCAGCTC 3000

ACGGGGCTGA TGGACATGAA GATCTTCTAC AAGGGGAAGA ACACAGGAAG CAATGCGTGT 3060 

GTACCCAGGC CGTGCAGCCT GCTGTGCCTG CCCAGAGCCA ACAACAGCAA AAGCTGCAGG 3120 

TGTCCAGATG GCGTGGCCAG CAGTGTCCTC CCTTCCGGGG ACCTGATGTG TGACTGCCCT 3180 

AAGGGCTACG AGCTGAAGAA CAACACGTGT GTCAAAGAAG AAGACACCTG TCTGCGCAAC 3240 

CAGTACCGCT GCAGCAACGG GAACTGCATC AACAGCATCT GGTGGTGCGA TTTCGACAAC 3300 

GACTGCGGAG ACATGAGCGA CGAGAAGAAC TGCCCTACCA CCATCTGCGA CCTGGACACC 3360

CAGTTCCGTT GCCAGGAGTC TGGGACGTGC ATCCCGCTCT CCTACAAATG TGACCTCGAG 3420 

GATGACTGTG GGGACAACAG TGACGAAAGG CACTGTGAAA TGCACCAGTG CCGGAGCGAC 3480

GAATACAACT GCAGCTCGGG CATGTGCATC CGCTCCTCCT GGGTGTGCGA CGGGGACAAC 3540 

GACTGCAGGG ACTGGTCCGA CGAGGCCAAC TGCACAGCCA TCTATCACAC CTCTCAGCCC 3600 

TCCAACTTCC AGTGCCGCAA CGGGCACTGC ATCCCCCAGC GGTGGGCGTG TGACGGCGAC 3660 

GCCGACTGCC AGGATGGCTC TGATGAGGAT CCAGCCAACT GTGAGAAGAA 3TGCAACGGC 3720 

TTCCGCTGCC CGAACGGCAC CTGCATTCCC TCCACCAAGC ACTGTGACGG CCTGCACGAT 3780 

TCCTCGGACG GCTCCGACGA GCAGCACTGC GAGCCCCTGT GTACACGGTT CATGGACTTC 3840 

GTGTGTAAGA ACCGCCAGCA GTGCCTCTTC CACTCCATGG TGTGCGATGG GATCATCCAG 3900 

TGCCGTGACG GCTCCGACGA GGACCCAGCC TTTGCAGGAT GCTCCCGAGA CCCCGAGTTC 3960 

CACAAGGTGT GCGATGAGTT CGGCTTCCAG TGTCAGAACG GCGTGTGCAT CAGCTTGATC 4020

TGGAAGTGCG ACGGGATGGA TGACTGCGGG GACTACTCCG ACGAGGCCAA CTGTGAAAAC 4080 

CCCACAGAAG CCCCCAACTG CTCCCGCTAC TTCCAGTTCC GGTGTGACAA TGGCCACTGC 4140 

ATCCCCAACA GGTGGAAGTG TGACAGGGAG AATGACTGTG GGGACTGGTC CGACGAGAAG 4200 

GACTGTGGAG ATTCACATGT ACTTCCGTCT ACGACTCCTG CACCCTCCAC GTGTCTGCCC 4260 

AATTACTACC GCTGCGGCGG GGGGGCCTGC GTGATAGACA CGTGGGTTTG TGACGGGTAC 4320 

CGAGATTGCG CAGATGGATC CGACGAGGAA GCCTGCCCCT CGCTCCCCAA TGTCACTGCC 4380 

ACCTCCTCCC CCTCCCAGCC TGGACGATGC GACCGATTTG AGTTTGAGTG CCACCAGCCA 4440 

AAGAAGTGCA TCCCTAACTG GAGACGCTGT GACGGCCATC AGGATTGCCA GGATGGCCAG 4500 

GACGAGGCCA ACTGCCCCAC TCACAGCACC TTGACCTGCA TGAGCTGGGA GTTCAAGTGT 4560 

GAGGATGGCG AGCCCTGCAT CGTGCTGTCA GAACGCTGCG ACGGCTTCCT GGACTGCTCA 4620 
GATGAGAGCG ACGAGAAGGC CTGCAGTGAT GAGTTAACTG TATACAAAGT ACAGAATCTT 4680 
CAGTGGACAG CTGACTTCTC TGGGAATGTC ACTTTGACCT GGATGCGGCC CAAAAAAATG 4740 
CCCTCTGCTG CTTGTGTATA CAACGTGTAC TATAGAGTTG TTGGAGAGAG CATATGGAAG 4800 
ACTCTGGAGA CTCACAGCAA TAAGACAAAC ACTGTATTAA AAGTGTTGAA ACCAGATACC 4860 
ACCTACCAGG TTAAAGTGCA GGTTCAGTGC CTGAGCAAGG TGCACAACAC CAATGACTTT 4920 
GTGACCTTCA GAACTCCAGA GGGATTGCCA GACGCCCCTC AGAACCTCCA GCTGTCGCTC 4980 
CACGGGGAAG AGGAAGGTGT GATTGTGGGC CACTGGAGCC CTCCCACCCA CACCCACGGC 5040 
CTCATTCGCG AATACATTGT AGAGTATAGC AGGAGTGGTT CCAAGGTGTG GACTTCAGAA 5100 
AGGGCTGCTA GTAACTTTAC AGAAATAAAG AACTTGTTGG TCAACACCCT GTACACCGTC 5160 
AGAGTGGCTG CGCTGACGAG TCGTGGGATA GGAAACTGGA GCGATTCCAA ATCCATTACC 5220 
ACCGTGAAAG GAAAAGCGAT CCCGCCACCA AATATCCACA TTGACAACTA CGATGAAAAT 5280 
TCCCTGAGTT TTACCCTGAC CGTGGATGGG AACATCAAGG TGAATGGCTA TGTGGTGAAC 5340 
CTTTTCTGGG CATTTGACAC CCACAAACAA GAGAAGAAAA CCATGAACTT CCAAGGGAGC 5400 
TCAGTGTCCC ACAAAGTTGG CAATCTGACA GCACAGACGG CCTATGAGAT TTCCGCCTGG 5460 
CCCAAGACTG ACTTGGGCGA TAGTCCTCTG TCATTTGAGC ATGTCACGAC CAGAGGGGTT 5520 
CGCCCACCTG CTCCTAGCCT CAAGGCCAGG GCTATCAATC AGACTGCAGT GGAATGCACC 5580 
TGCACAGGCC CCAGGAATGT GGTGTATGGC ATTTTCTATG CCACATCCTT CCTGGACCTC 5640 
TACCGGAACC CAAGCAGCCT GACCACGCCG CTGCACAACG CAACCGTGCT CGTCGGTAAG 5700 
:ATGAGCACT ATCTGTTTCT GGTCCGGGTG GTGATGCCCT ACCAAGGGCC GTCCTCGGAC 5760 
TACGTGGTCG TGAAGATGAT CCCGGACAGC AGGCTTCCTC CCCGGCACCT GCATGCCGTT 5820 
:A:AG:GGCA AGACCTCGGC CGTCATCAAG TGGGAGTCGC CCTACGACTC TCCTGACCAG 5880 
GACCTGTTCT ATGCGATCGC AGTTAAAGAT CTGATACGAA AGACGGACCG GAGCTACAAA 5940 
GTCAAGTCCC GCAACAGCAC CGTGGAGTAC ACCCTGAGCA AGCTCGAGCC CGGAGGGAAA 6000 
TACCAGGTCA TTGTGCAGCT GGGGAACATG AGCAAAGATG CCAGTGTGAA GATCACCACC 6060 
GTTTCGTTAT CGGCACCCGA TGCCTTAAAA ATCATAACAG AAAATGACCA CGTCCTTCTC 6120 
TTGTGGAAAA GTCTAGCTCT AAAGGAAAAG TATTTTAACG AAAGCAGGGG CTACGAGATA 6180 
CACATGTTTG ATAGCGCCAT GAATATCACC GCATACCTTG GGAATACTAC TGACAATTTC 6240 
TTTAAAATTT CCAACCTGAA GATGGGTCAC AATTACACAT TCACGGTCCA GGCACGATGC 6300 
CTTTTGGGCA GCCAGATCTG CGGGGAGCCT GCCGTGCTAC TGTATGATGA GCTGGGGTCT 6360 
GGTGGCGATG CGTCGGCGAT GCAGGCTGCC AGGTCTACTG ATGTCGCCGC CGTGGTGGTG 6420 
CCCATCCTGT TTCTGATACT GCTGAGCCTG GGGGTCGGGT TTCCCATCCT GTACACGAAG 6480 
CATCCGAGGC TGCAGAGCAG CTTCACCGCC TTCGCCAACA GCCACTACAG CTCCAGACTC 6540 
GGCTCCGCCA TCTTCTCCTC TCGGGATGAC TTGGGGGAGG ATGATGAACA TGCTCCTATG 6600 
ATCACTGGAT TTTCGGACGA CGTCCCCATG GTGATAGCC 6639 
Sequence ID No. 2 
Length of the Sequence: 2213 
Type: amino acid 
Topology: linear 
Molecular type: Protein 
Sequence : 

Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe 

5 10 15 

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr 

20 25 30 

Arg Thr Leu His Gly Gly Arg Ala Pro Leu Pro Gln Glu Arg Gly Phe 
35 40 45 

Arg Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Glu Arg Gly 
50 55 60 

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Arg 
65 70 75 80 

Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val 
85 90 95 

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu 
100 105 110 

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala 
115 120 125 
Arg Pro Arg Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser 
130 135 140 

Phe Asn Lys Ile Ser Glu Lys Leu Asn Phe Gly Ala Gly Asn Asn Thr 
145 150 155 160 

Glu Ala Val Val Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg 
165 170 175 

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp 
180 185 190 

Phe Cys Asn Thr Ile His Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp 
195 200 205 

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg 
210 215 220 

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr 
225 230 235 240 

Trp Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp 
245 250 255 

Pro Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser 
260 265 270 

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu 
275 280 285 

Asn Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp 
290 295 300 

Lys Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Pro Leu 
305 310 315 320 

Gln Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg 
325 330 335 

Ala Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala 
340 345 350 
Asp Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn 
355 360 365 

Arg Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu 
370 375 380 

Ser Leu Glu Asn Val Leu Tyr Tyr Thr Pro Gly Gly Ala Gly Ser Asp 
385 390 395 400 

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg 
405 410 415 

Val Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser 
420 425 430 

Met Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly 
435 440 445 

Thr Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys 
450 455 460 

Ile Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gln Arg 
465 470 475 480 

Leu Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser 
485 490 495 

Lys Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys 
500 505 510 

Asn Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala 
515 520 525 

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp 
530 535 540 

His Gly Gly Ile Ile Met Ala Ile Ala Gln Gly Met Glu Thr Asn Glu 
545 550 555 560 

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe 
565 570 575 
Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu 
580 585 590 

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His 
595 600 605 

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro 
610 615 620 

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly 
625 630 635 640 

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro 
645 650 655 

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val 
660 665 670 

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe 
675 680 685 

Arg Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly 
690 695 700 

Phe Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr 
705 710 715 720 

Tyr Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser 
725 730 735 

Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro 
740 745 750 

Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile 
755 760 765 

His Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr 
770 775 780 

Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys 
785 790 795 800 
Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu 
805 810 815 

Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr 
820 825 830 

Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val 
835 840 845 

Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe 
850 855 860 

Arg Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu 
865 870 875 880 

Val Leu Val Pro Gln Glu Gly Ile Met Phe Trp Thr Asp Trp Gly Asp 
885 890 895 

Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr 
900 905 910 

Arg Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp 
915 920 925 

Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Asp Cys Ile Glu Arg 
930 935 940 

Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Arg Leu Pro 
945 950 955 960 

His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp 
965 970 975 

Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln 
980 985 990 

Met Glu Ile Leu Ala Ser Gln Leu Thr Gly Leu Met Asp Met Lys Ile 
995 1000 1005 

Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro 
1010 1015 1020 
Cys Ser Leu Leu Cys Leu Pro Arg Ala Asn Asn Ser Lys Ser Cys Arg 
1025 1030 1035 1040 

Cys Pro Asp Gly Val Ala Ser Ser Val Leu Pro Ser Gly Asp Leu Met 
1045 1050 1055 

Cys Asp Cys Pro Lys Gly Tyr Glu Leu Lys Asn Asn Thr Cys Val Lys 
1060 1065 1070 

Glu Glu Asp Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn 
1075 1080 1085 

Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp 
1090 1095 1100 

Met Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr 
1105 1110 1115 1120 

Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys 
1125 1130 1135 

Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys 
1140 1145 1150 

Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met 
1155 1160 1165 

Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp 
1170 1175 1180 

Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala 
1185 1190 1195 1200 

Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala 
1205 1210 1215 

Cys Asp Gly Asp Ala Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Ala 
1220 1225 1230 

Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys 
1235 1240 1245 
Ile Pro Ser Thr Lys His Cys Asp Gly Leu His Asp Cys Ser Asp Gly 
1250 1255 1260 

Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe 
1265 1270 1275 1280 

Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp 
1285 1290 1295 

Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala 
1300 1305 1310 

Gly Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly 
1315 1320 1325 

Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp 
1330 1335 1340 

Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn 
1345 1350 1355 1360 

Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp 
1365 1370 1375 

Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp 
1380 1385 1390 

Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu 
1395 1400 1405 

Pro Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg 
1410 1415 1420 

Cys Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr 
1425 1430 1435 1440 

Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro 
1445 1450 1455 

Asn Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg 
1460 1465 1470 
Phe Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg 
1475 1480 1485 

Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn 
1490 1495 1500 

Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys 
1505 1510 1515 1520 

Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe 
1525 1530 1535 

Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu 
1540 1545 1550 

Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly 
1555 1560 1565 

Asn Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala 
1570 1575 1580 

Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys 
1585 1590 1595 1600 

Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu 
1605 1610 1615 

Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser 
1620 1625 1630 

Lys Val His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly 
1635 1640 1645 

Leu Pro Asp Ala Pro Gln Asn Leu Gln Leu Ser Leu His Gly Glu Glu 
1650 1655 1660 

Glu Gly Val Ile Val Gly His Trp Ser Pro Pro Thr His Thr His Gly 
1665 1670 1675 1680 

Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Val 
1685 1690 1695 
Trp Thr Ser Glu Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu 
1700 1705 1710 

Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg 
1715 1720 1725 

Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Val Lys Gly 
1730 1735 1740 

Lys Ala Ile Pro Pro Pro Asn Ile His Ile Asp Asn Tyr Asp Glu Asn 
1745 1750 1755 1760 

Ser Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly 
1765 1770 1775 

Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Lys 
1780 1785 1790 

Lys Thr Met Asn Phe Gln Gly Ser Ser Val Ser His Lys Val Gly Asn 
1795 1800 1805 

Leu Thr Ala Gln Thr Ala Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp 
1810 1815 1820 

Leu Gly Asp Ser Pro Leu Ser Phe Glu His Val Thr Thr Arg Gly Val 
1825 1830 1835 1840 

Arg Pro Pro Ala Pro Ser Leu Lys Ala Arg Ala Ile Asn Gln Thr Ala 
1845 1850 1855 

Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe 
1860 1865 1870 

Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Ser Ser Leu Thr 
1875 1880 1885 

Thr Pro Leu His Asn Ala Thr Val Leu Val Gly Lys Asp Glu Gln Tyr 
1890 1895 1900 

Leu Phe Leu Val Arg Val Val Met Pro Tyr Gln Gly Pro Ser Ser Asp 
1905 1910 1915 1920 
Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His 
1925 1930 1935 

Leu His Ala Val His Thr Gly Lys Thr Ser Ala Val Ile Lys Trp Glu 
1940 1945 1950 

Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Phe Tyr Ala Ile Ala Val 
1955 1960 1965 

Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg 
1970 1975 1980 

Asn Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys 
1985 1990 1995 2000 

Tyr His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val 
2005 2010 2015 

Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile 
2020 2025 2030 

Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys 
2035 2040 2045 

Glu Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp 
2050 2055 2060 

Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe 
2065 2070 2075 2080 

Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val 
2085 2090 2095 

Gln Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val 
2100 2105 2110 

Leu Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln 
2115 2120 2125 

Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe 
2130 2135 2140 
Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys 
2145 2150 2155 2160 

His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr 
2165 2170 2175 

Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly 
2180 2185 2190 

Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val 
2195 2200 2205 

Pro Met Val Ile Ala 
2210 

Sequence ID No. 3 

Length of the Sequence: 6961 

Type: nucleic acid 

Strandedness : double 

Topology: linear 

Molecular type: cDNA to mRNA 

Feature : 

Name/Key: sig peptide 

Location: 178 . . 261 

Identification method: S 

Name/Key: mat peptide 

Location: 262 . . 6816 

Identification method: S 
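
The feature locations above can be cross-checked against the lengths stated elsewhere in this listing. The following is a minimal sketch, not part of the original disclosure. It assumes that the locations are 1-based and inclusive and that the mat peptide range contains no stop codon; the helper name `feature_length` is hypothetical.

```python
# Feature locations copied from the listing for Sequence ID No. 3.
# Assumption: coordinates are 1-based and inclusive.
SIG_PEPTIDE = (178, 261)
MAT_PEPTIDE = (262, 6816)


def feature_length(span):
    """Return the number of nucleotides covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


sig_nt = feature_length(SIG_PEPTIDE)   # nucleotides in the signal peptide region
mat_nt = feature_length(MAT_PEPTIDE)   # nucleotides in the mature peptide region
coding_nt = sig_nt + mat_nt            # total coding nucleotides
total_aa = coding_nt // 3              # three nucleotides per amino acid residue

print(sig_nt // 3, mat_nt // 3, total_aa, coding_nt)
```

Under these assumptions the two regions give 28 and 2185 residues, 2213 in total, which agrees with the length stated for Sequence ID No. 2, and the 6639 coding nucleotides agree with the final position number printed for the preceding nucleotide sequence.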
Sequence : 

CCGCGAGCCG CACACGTGAC GGCGCCGCGC CGCGCCGCGC CGCGCCGAGC GGGACCCAGC 60 
GGCTGCCCGG AGCCCCGGGA GCGGCGCGCG CGCGGCCCCG GCCCCGCCGC TCGGCCGGCG 120 
GCGCGCTGCA CATTCTCTCC TGGCGGCGGC GCCACCTGCA GCCGCGTTCG CCCGAACATG 180 

Met 
GCG ACA CCG AGC AGC AGG AGC GAG TCG CGA CTC CCC TTC CTA TTC ACC 228 
Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr 
5 10 15 

CTG GTC GCG CTG CTG CCG CCC GGG GCT CTC TGC GAG GTG TGG ACG CGG 276 
Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr Arg 
20 25 30 

ACA CTG CAC GGC GGC CGC GCG CCC TTA CCC GAG GAG CGG GGC TTC CGC 324 
Thr Leu His Gly Gly Arg Ala Pro Leu Pro Gln Glu Arg Gly Phe Arg 
35 40 45 

GTG GTG CAG GGC GAC CCG CGC GAG CTG CGG GTG TGG GAG CGC GGG GAT 372 
Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Glu Arg Gly Asp 
50 55 60 65 

GCC AGG GGG GCG AGC CGG GCG GAC GAG AAG CCG CTC CGG AGG AGA CGG 420 
Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Arg Arg 
70 75 80 

AGC GCT GCC CTG CAG CCC GAG CCC ATC AAG GTG TAC GGA CAG GTC AGC 468 
Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val Ser 
85 90 95 

CTC AAT GAT TCC CAC AAT CAG ATC GTG GTC CAC TGG GCC GGA GAG AAA 516 
Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu Lys 
100 105 110 

AGC AAC GTG ATC GTG GCC TTG GCC CGG GAC AGC CTG GCG TTG GCC AGG 564 
Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala Arg 
115 120 125 

CCC AGG AGC AGT GAT GTG TAC GTG TCT TAT GAC TAT GGA AAA TCA TTC 612 
Pro Arg Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser Phe 
130 135 140 145 
AAT AAG ATT TCA GAG AAA TTG AAC TTC GGC GCG GGA AAT AAC ACA GAG 660 
Asn Lys Ile Ser Glu Lys Leu Asn Phe Gly Ala Gly Asn Asn Thr Glu 
150 155 160 

GCT GTG GTG GCC CAG TTC TAC CAC AGC CCT GCG GAC AAC AAA CGG TAC 708 
Ala Val Val Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg Tyr 
165 170 175 

ATC TTC GCA GAT GCC TAC GCC CAG TAT CTC TGG ATC ACG TTT GAC TTC 756 
Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp Phe 
180 185 190 

TGC AAC ACC ATC CAT GGC TTT TCC ATC CCG TTC CGG GCA GCT GAT CTC 804 
Cys Asn Thr Ile His Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp Leu 
195 200 205 

CTA CTC CAC AGT AAG GCC TCC AAC CTT CTC CTG GGC TTC GAC AGG TCT 852 
Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg Ser 
210 215 220 225 

CAC CCC AAC AAG CAG CTG TGG AAG TCG GAT GAT TTT GGC CAG ACC TGG 900 
His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr Trp 
230 235 240 

ATC ATG ATT CAA GAA CAC GTG AAG TCC TTT TCT TGG GGA ATT GAT CCC 948 
Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp Pro 
245 250 255 

TAT GAC AAA CCA AAC ACC ATC TAC ATC GAA CGG CAC GAA CCT TCT GGC 996 
Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser Gly 
260 265 270 

TAC TCC ACG GTT TTC CGA AGT ACA GAC TTC TTC CAG TCC CGG GAA AAC 1044 
Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu Asn 
275 280 285 

CAG GAA GTG ATC TTG GAG GAA GTG AGA GAC TTT CAG CTT CGG GAC AAG 1092 
Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp Lys 
290 295 300 305 

TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG GGC AGT CCA CTG CAG 1140 
Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Pro Leu Gln 
310 315 320 

TCT TC? GTC CAG CTC TGG GTC TCC TTT GGC CGG AAG cc: ATG CGG GCC 1188 
Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg Ala 
325 330 335 

GCC CAG TTT GTT ACA AGA CAT CCT ATC AAC GAA TAT TAC ATC GCG GAT 1236 
Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala Asp 
340 345 350 

GCC TCG GAG GAC CAG GTG TTT GTG l Li l GTC AGT CAC AGC AAC AAC pro ^-UL, 1284 
Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn Arg 
355 360 365 

AAC CTC TAC ATC TCG GAG GCA GAG GGC TTG AAG TTC TCT CTG TCC 1332 
Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu Ser 
370 375 380 385 

CTG GAG AAC GTG CTC TAC TAC ACC CCG GGA GGG GCC GGC AGT GAC ACC 1380 
Leu Glu Asn Val Leu Tyr Tyr Thr Pro Gly Gly Ala Gly Ser Asp Thr 
390 395 400 

TT'j GTC AX TAC TTT GCA AAT GAA CCG TTT GCT GAC TTC CAT CGT GTG 1428 
Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg Val 
405 410 415 

GGG TTG GGA GTC TAC ATT ACT CTG ATT AAT GGT TCT ATG 1476 
Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser Met 
420 425 430 

ATC TTT AAA GGG GGC ACC 1524 
Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly Thr 
435 440 445 

TGG GAA TTT CTG CAG GCT CCA GCC TTC ACG GGG TAT GGA GAG AAA ATC 1572 
Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys Ile 
450 455 460 465 

AAC TGT GAG CTG TCC GAG GGC TGT TCC CTC CAC CTG GCC CAG CGC CTC 1620 
Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gln Arg Leu 
470 475 480 

AGC CAG CTG CTC AAC CTC CAG CTC CGG AGG ATG CCC ATC CTG TCC AAG 1668 
Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser Lys 
485 490 495 

GAG TCG GCG CCT GGC CTC ATC ATT GCC ACG GGC TCA GTG GGA AAG AAC 1716 
Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys Asn 
500 505 510 

TTG GCT AGC AAG ACA AAC GTG TAC ATC TCT AGC AGT GCT GGA GCC AGG 1764 
Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala Arg 
515 520 525 

TGG CGA GAG GCA CTT CCT GGA CCT CAC TAC TAT ACA TGG GGA GAC CAT 1812 
Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp His 
530 535 540 545 

GGC GGC ATC ATC ATG GCC ATT GCC CAA GGC ATG GAA ACC AAC GAA CTG 1860 
Gly Gly Ile Ile Met Ala Ile Ala Gln Gly Met Glu Thr Asn Glu Leu 
550 555 560 

AAG TAC AGT ACC AAC GAA GGG GAG ACC TGG AAA GCC TTC ACC TTC TCT 1908 
Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe Ser 
565 570 575 

GAG AAG CCC GTG TTT GTG TAT GGG CTC CTC ACG GAA CCC GGC GAG AAG 1956 
Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu Lys 
580 585 590 
AGC ACG GTC TTC ACC ATC TTT GGC TCC AAC AAG GAG AAC GTC CAC ACC 2004 
Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His Ser 
595 600 605 

TGG CTC ATC CTC CAG GTC AAT GCC ACA GAC GCC CTG GGG GTT CCT TGC 2052 
Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro Cys 
610 615 620 625 

ACA GAG AAC GAC TAC AAG CTC TGG TCA CCA TCT GAT GAG CGG GGG AAT 2100 
Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly Asn 
630 635 640 

GAG TG~ TTG CTT GGA CAC AAG ACT GTT TTC AAA CGG AGG ACC CCG CAC 2148 
Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro His 
645 650 655 

GCC ACA TGC TTT AAC GGA GAA GAC TTT GAC AGG CCG GTG GTT GTG TCC 2196 
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser 
660 665 670 

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAC TGT GAC TTT GGC TTC CGG 2244 
Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe Arg 
675 680 685 

ATG AGT CAA GAC TTG GCA TTA GAG GTG TGT GTT CCA GAT CCA GGA TTT 2292 
Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly Phe 
690 695 700 705 

TCT GGA AAG TCC TCC CCT CCA GTG CCT TGT CCC GTG GGC TCT ACG TAC 2340 
Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr Tyr 
710 715 720 

AGG CGA TCA AGA GGC TAC CGG AAG ATT TCT GGG GAC ACC TGT AGT GGA 2388 
Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser Gly 
725 730 735 
GGA GAT GTT GAG GCA CGG CTA GAA GGA GAG CTC GTC CCC TGT CCC CTG 2436 
Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro Leu 
740 745 750 

GCA GAA GAG AAC GAG TTC ATC CTG TAC GCC ACG CGC AAG TCC ATC CAC 2484 
Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile His 
755 760 765 

CGC TAT GAC CTG GCT TCC GGA ACC ACG GAG CAG TTG CCC CTC ACT GGG 2532 
Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr Gly 
770 775 780 785 

TTC CGG GCA GCA GTG GCC CTG GAC TTT GAC TAT GAG CAC AAC TGC CTG 2580 
Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys Leu 
790 795 800 

TAT TGG TCT GAC CTC GCC TTG GAC GTC ATC CAG CGC CTC TGT TTG AAC 2628 
Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu Asn 
805 810 815 

CGG ACT ACA GGA CAA GAG GTG ATC ATC AAC TCT GAC CTG GAG ACG GTA 2676 
Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr Val 
820 825 830 

CAA GCT TTG GCT TTT GAA CCC CTC AGC CAA TTA CTT TAC TGG GTG GAC 2724 
Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val Asp 
835 840 845 

CCA GGC TTT AAA AAG ATC GAG GTA GCC AAT CCA GAT GGT GAC TTC CGA 2772 
Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe Arg 
850 855 860 865 

CTC ACC GTC GTC AAT TCC TCG GTG CTG GAT CGG CCC CGG GCC CTG GTC 2820 
Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu Val 
870 875 880 

CTT GTG CCC CAA GAA GGG ATC ATG TTC TGG ACC GAC TGG GGA GAC CTG 2868 
Leu Val Pro Gln Glu Gly Ile Met Phe Trp Thr Asp Trp Gly Asp Leu 
885 890 895 

AAG CCT GGG ATT TAT CGG AGC AAC ATG GAC GGA TCT GCC GCC TAT CGC 2916 
Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr Arg 
900 905 910 

CTC GTG TCG GAG GAT GTG AAG TGG CCC AAT GGC ATT TCC GTG GAC GAT 2964 
Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp Asp 
915 920 925 

CAG TGG ATC TAC TGG ACG GAT GCC TAC CTG GAC TGC ATT GAG CGC ATC 3012 
Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Asp Cys Ile Glu Arg Ile 
930 935 940 945 

ACG TTC AGC GGC CAG CAG CGC TCC GTC ATC CTG GAC AGA CTC CCG CAC 3060 
Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Arg Leu Pro His 
950 955 960 

CCC TAT GCC ATT GCT GTC TTT AAG AAT GAG ATT TAC TGG GAT GAC TGG 3108 
Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp Trp 
965 970 975 

TCA CAG CTC AGC ATA TTC CGA GCT TCT AAG TAC AGC GGG TCC CAG ATG 3156 
Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln Met 
980 985 990 

GAG ATT :TG GCC AGC CAG CTC ACG GGG CTG ATG GAC ATG AAG ATC TTC 3204 
Glu Ile Leu Ala Ser Gln Leu Thr Gly Leu Met Asp Met Lys Ile Phe 
995 1000 1005 

TAC AAG GGG AAG AAC ACA GGA AGC AAT GCG TGT GTA CCC AGG CCG TCC 3252 
Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro Cys 
1010 1015 1020 1025 

AGC CTG CTG TGC CTG CCC AGA GCC AAC AAC AGC AAA AGC TGC AGG TGT 3300 
Ser Leu Leu Cys Leu Pro Arg Ala Asn Asn Ser Lys Ser Cys Arg Cys 
1030 1035 1040 
CCA GAT GGC GTG GCC AGC AGT GTC CTC CCT TCC GGG GAC CTG ATG TGT 3348 
Pro Asp Gly Val Ala Ser Ser Val Leu Pro Ser Gly Asp Leu Met Cys 
1045 1050 1055 

GAC TGC CCT AAG GGC TAC GAG CTG AAG AAC AAC ACG TGT GTC AAA GAA 3396 
Asp Cys Pro Lys Gly Tyr Glu Leu Lys Asn Asn Thr Cys Val Lys Glu 
1060 1065 1070 

GAA GAC ACC TGT CTG CGC AAC CAG TAC CGC TGC AGC AAC GGG AAC TGC 3444 
Glu Asp Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn Cys 
1075 1080 1085 

ATC AAC AGC ATC TGC TGG TGC GAT TTC GAC AAC GAC TGC GGA GAC ATG 3492 
Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp Met 
1090 1095 1100 1105 

AGC GAC GAG AAG AAC TGC CCT ACC ACC ATC TGC GAC CTG GAC ACC CAG 3540 
Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr Gln 
1110 1115 1120 

TTC CGT TGC CAG GAG TCT GGG ACG TGC ATC CCG CTC TCC TAC AAA TGT 3588 
Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys Cys 
1125 1130 1135 

GAC CTC GAG GAT GAC TGT GGG GAC AAC AGT GAC GAA AGG CAC TGT GAA 3636 
Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys Glu 
1140 1145 1150 

ATG CAC CAG TGC CGG AGC GAC GAA TAC AAC TGC AGC TCG GGC ATG TGC 3684 
Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met Cys 
1155 1160 1165 

ATC CGC TCC TCC TGG GTG TGC GAC GGG GAC AAC GAC TGC AGG GAC TGG 3732 
Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp Trp 
1170 1175 1180 1185 

TCC GAC GAG GCC AAC TGC ACA GCC ATC TAT CAC ACC TGT GAG GCC TCC 3780 
Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala Ser 
1190 1195 1200 

AAC TTC CAG TGC CGC AAC GGG CAC TGC ATC CCC CAG CGG TGG GCG TGT 3828 
Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala Cys 
1205 1210 1215 

GAC GGC GAC GCC GAC TGC CAG GAT GGC TCT GAT GAG GAT CCA GCC AAC 3876 
Asp Gly Asp Ala Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Ala Asn 
1220 1225 1230 

TGT GAG AAG AAG TGC AAC GGC TTC CGC TGC CCG AAC GGC ACC TGC ATT 3924 
Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys Ile 
1235 1240 1245 

CCC TCC ACC AAG CAC TGT GAC GGC CTC CAC GAT TGC TCC GAC GGC TCC 3972 
Pro Ser Thr Lys His Cys Asp Gly Leu His Asp Cys Ser Asp Gly Ser 
1250 1255 1260 1265 

GAC GAG CAG CAC TGC GAG CCC CTG TGT ACA CGG TTC ATG GAC TTC GTG 4020 
Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe Val 
1270 1275 1280 

TGT AAG AAC CGC CAG CAG TGC CTC TTC CAC TCC ATG GTG TGC GAT GGG 4068 
Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp Gly 
1285 1290 1295 

ATC ATC CAG TGC CGT GAC GCC TCC GAC GAG GAC CCA GCC TTT GCA GGA 4116 
Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala Gly 
1300 1305 1310 

TGC TCC CGA GAC CCC GAG TTC CAC AAG GTG TGC GAT GAG TTC GGC TTC 4164 
Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly Phe 
1315 1320 1325 

CAG TGT CAG AAC GGC GTG TGC ATC AGC TTG ATC TGG AAG TGC GAC GGG 4212 
Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp Gly 
1330 1335 1340 1345 

ATG GAT GAC TGC GGG GAC TAC TCC GAC GAG GCC AAC TGT GAA AAC CCC 4260 
Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn Pro 
1350 1355 1360 

ACA GAA GCC CCC AAC TGC TCC CGC TAC TTC CAG TTC CGG TGT GAC AAT 4308 
Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp Asn 
1365 1370 1375 

CGC CAC TGC ATC CCC AAC AGG TGG AAG TGT GAC AGG GAG AAT GAC TGT 4356 
Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp Cys 
1380 1385 1390 

GGG GAC TGG TCC GAC GAG AAG GAC TGT GGA GAT TCA CAT GTA CTT CCG 4404 
Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu Pro 
1395 1400 1405 

TCT ACG ACT CCT GCA CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC TCC 4452 
Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg Cys 
1410 1415 1420 1425 

GGC CGG GGG GCC TGC GTG ATA GAC ACG TGG GTT TGT GAC GGG TAC CGA 4500 
Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr Arg 
1430 1435 1440 

GAT TGC GCA GAT GGA TCC GAC GAG GAA GCC TGC CCC TCG CTC CCC AAT 4548 
Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro Asn 
1445 1450 1455 

GTC ACT GCC ACC TCC TCC CCC TCC CAG CCT GGA CGA TGC GAC CGA TTT 4596 
Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg Phe 
1460 1465 1470 

GAG TTT GAG TGC CAC CAG CCA AAG AAG TGC ATC CCT AAC TGG AGA CGC 4644 
Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg Arg 
1475 1480 1485 
TGT GAC GGC CAT CAG GAT TGC CAG GAT GGC CAG GAC GAG GCC AAC TGC 4692 
Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn Cys 
1490 1495 1500 1505 

CCC ACT CAC AGC ACC TTG ACC TGC ATG AGC TGG GAG TTC AAG TGT GAG 4740 
Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys Glu 
1510 1515 1520 

GAT GGC GAG GCC TGC ATC GTG CTG TCA GAA CGC TGC GAC GGC TTC CTG 4788 
Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe Leu 
1525 1530 1535 

GAC TGC TCA GAT GAG AGC GAC GAG AAG GCC TGC ACT GAT GAG TTA ACT 4836 
Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu Thr 
1540 1545 1550 

GTA TAC AAA GTA CAG AAT CTT CAG TGG ACA GCT GAC TTC TCT GGG AAT 4884 
Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly Asn 
1555 1560 1565 

GTC ACT TTG ACC TGG ATG CGG CCC AAA AAA ATG CCC TCT GCT GCT TGT 4932 
Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala Cys 
1570 1575 1580 1585 

GTA TAC AAC GTG TAC TAT AGA GTT GTT GGA GAG AGC ATA TGG AAG ACT 4980 
Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys Thr 
1590 1595 1600 

CTG GAG ACT CAC AGC AAT AAG ACA AAC ACT GTA TTA AAA GTG TTG AAA 5028 
Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu Lys 
1605 1610 1615 

CCA GAT ACC ACC TAC CAG GTT AAA GTG CAG GTT CAG TGC CTG AGC AAG 5076 
Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser Lys 
1620 1625 1630 

GTG CAC AAC ACC AAT GAC TTT GTG ACC TTG AGA ACT CCA GAG GGA TTG 5124 
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Val His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly Leu 

1635 1640 1645

CCA GAC GCC CCT CAG AAC CTC CAG CTG TCG CTC CAC GGG GAA GAG GAA
Pro Asp Ala Pro Gln Asn Leu Gln Leu Ser Leu His Gly Glu Glu Glu
1650 1655 1660 1665

GGT GTG ATT GTG GGC CAC TGG AGC CCT CCC ACC CAC ACC CAC GGC CTC
Gly Val Ile Val Gly His Trp Ser Pro Pro Thr His Thr His Gly Leu
1670 1675 1680

ATT CGC GAA TAC ATT GTA GAG TAT AGC AGG AGT GGT TCC AAG GTG TGG
Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Val Trp
1685 1690 1695

ACT TCA GAA AGG GCT GCT AGT AAC TTT ACA GAA ATA AAG AAC TTG TTG
Thr Ser Glu Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu Leu
1700 1705 1710

GTC AAC ACC CTG TAC ACC GTC AGA GTG GCT GCG GTG ACG AGT CGT GGG
Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg Gly
1715 1720 1725

ATA GGA AAC TGG AGC GAT TCC AAA TCC ATT ACC ACC GTG AAA GGA AAA
Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Val Lys Gly Lys
1730 1735 1740 1745

GCG ATC CCG CCA CCA AAT ATC CAC AT" GAC AAC TAC GAT GAA AAT TCC
Ala Ile Pro Pro Pro Asn Ile His Ile Asp Asn Tyr Asp Glu Asn Ser
1750 1755 1760

CTG AGT TTT ACC CTG ACC GTG GAT GGG AAC ATC AAG GTG AAT GGC TAT
Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly Tyr
1765 1770 1775

GTG GTG AAC CTT TTC TGG GCA TTT GAC ACC CAC AAA CAA GAG AAG AAA
Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Lys Lys
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1780 1785 1790 

ACC ATC AAC TTC CAA GGG AGC TCA GTG TCC CAC AAA GTT GCC AAT CTG 5604
Thr Met Asn Phe Gln Gly Ser Ser Val Ser His Lys Val Gly Asn Leu
1795 1800 1805

ACA GCA CAG ACG GCC TAT GAG ATT TCC GCC TGG GCC AAG ACT CAC TTG 5652
Thr Ala Gln Thr Ala Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp Leu
1810 1815 1820 1825

GGC GAT AGT CCT CTG TA TTT GAG CAT GTC ACG ACC AGA GGG GTT CGC 5700
Gly Asp Ser Pro Leu Ser Phe Glu His Val Thr Thr Arg Gly Val Arg
1830 1835 1840

CCA CCT GCT CCT AGC CTC AAG GCC AGG GCT ATC AAT CAG ACT GCA GTG 5748
Pro Pro Ala Pro Ser Leu Lys Ala Arg Ala Ile Asn Gln Thr Ala Val
1845 1850 1855

GAA TGC ACC TGG ACA GGC CCC AGG AAT GTG GTG TAT GGC ATT TTC TAT 5796
Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe Tyr
1860 1865 1870

GCC ACA TCC TTC CTG GAC CTC TAC CGC AAC CCA AGC AGC CTG ACC ACG 5844
Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Ser Ser Leu Thr Thr
1875 1880 1885

CCS CTG CAC AAC GCA ACC GTG CTC GTC GGT AAG GAT GAG CAG TAT CTG 5892
Pro Leu His Asn Ala Thr Val Leu Val Gly Lys Asp Glu Gln Tyr Leu
1890 1895 1900 1905

TTT CTG GTC CGG GTG GTG ATG CCC TAC CAA GGG CCG TCC TCG GAC TAC 5940
Phe Leu Val Arg Val Val Met Pro Tyr Gln Gly Pro Ser Ser Asp Tyr
1910 1915 1920

CTG GTC GTG AAG ATG ATC CCG GAC AGC AGG CTT CCT CCC CGG CAC CTG 5988
Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His Leu
1925 1930 1935
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CAT GCC GTT CAC ACC GGC AAG ACC TCG GCC GTC ATC AAG TGG GAG TCG 6036 
His Ala Val His Thr Gly Lys Thr Ser Ala Val Ile Lys Trp Glu Ser
1940 1945 1950

CCC TAC GAC TCT CCT GAC CAG GAC CTG TTC TAT GCG ATC GCA GTT AAA 6084
Pro Tyr Asp Ser Pro Asp Gln Asp Leu Phe Tyr Ala Ile Ala Val Lys
1955 1960 1965

GAT CTG ATA CGA AAG ACG GAC CGG AGC TAC AAA GTC AAG TCC CGC AAC 6132
Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg Asn
1970 1975 1980 1985

AGC ACC CTG GAG TAC ACC CTG AGC AAG CTG GAG CCC GGA GGG AAA TAC 6180
Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys Tyr
1990 1995 2000

CAC GTC ATT GTG CAG CTG GGG AAC ATG AGC AAA GAT GCC AGT GTG AAG 6228
His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val Lys
2005 2010 2015

ATC ACC ACC GTT TCG TTA TCG GCA CCC GAT GCC TTA AAA ATC ATA ACA 6276
Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile Thr
2020 2025 2030

GAA AAT GAC CAC GTC CTT CTC TTC TGG AAA AGT CTA GCT CTA AAG GAA 6324
Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys Glu
2035 2040 2045

AAG TAT TTT AAC GAA AGC AGG GGC TAC GAG ATA CAC ATG TTT GAT AGC 6372
Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp Ser
2050 2055 2060 2065

GCC ATG AAT ATC ACC GCA TAC CTT GGG AAT ACT ACT GAC AAT TTC TTT 6420
Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe Phe
2070 2075 2080

AAA ATT TCC AAC CTG AAG ATG GGT CAC AAT TAC ACA TTC ACG GTC CAG 6468
Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val Gln
2085 2090 2095

GCA CGA TGC CTT TTG GGC AGC CAG ATC TGC [illegible] [illegible] CCT GCC GTG [illegible] 6516
Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val Leu
2100 2105 2110

CTG TAT GAT GAG CTG [illegible] TCT GGT GGC CAT [illegible] TCG GCG ATG CAG [illegible] 6564
Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln Ala
2115 2120 2125

GCC AGG TCT ACT GAT CTC GCC GCC GTG GTG GTG CCC ATC CTG TTT [illegible] 6612
Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe Leu
2130 2135 2140 2145

ATA CTG CTG AGC CTG CGG GTC GGG TTT GCC ATC CTG TAC ACG AAG CAT 6660
Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys His
2150 2155 2160

CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC ACC 6708
Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr Ser
2165 2170 2175

TCC AGA CTC GGC TCC GCC ATC TTC [illegible] TCT GGG GAT GAC TTC GGG GAG
Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly Glu
2180 2185 2190

[illegible] GAT GAA GAT GCT CCT ATG ATC ACT GGA [illegible] TCG GAC GAC GTC CCC 6804
Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val Pro
2195 2200 2205

ATC GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6856 
Met Val Ile Ala

2210 

TTTTATTGA TAAAGATAGT ^GATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6916 
GTTATTTTTA TATGGGCCAA AAACAAAAGC AAAAAAAAAA AAAAA 6961 
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Sequence ID No. 4 

Length of the Sequence: 300
Type: nucleic acid
Strandedness: double
Topology: linear

Molecular type: cDNA to mRNA 
Sequence : 

ATATCCACAT TGACAGCTAT GGTGAAAATT ATCTAAGCTT CACCCTGACC ATGGAGAGTG 60
ATATCAAGCT GAATGGCTAT GTGGTGAACC TTTTCTGGGC ATTTGACACC CACAAGCAAG 120
AGAGGAGAAC TTTGAACTTC CGAGGAAGCA TA^GTCACA CAAAGTTGGC AATCTGACAG 180
CTCATACATC CTATGAGATT TCTGCCTGGG CCAAGACTGA CTTGGGGGAT AGCCCTCTGG 240
CATTTGAGCA TGTTATGACC AGAGGGGTTC GCCCACCTGC ACCTAGCCTC AAGGCCAAAG 300

Sequence ID No. 5

Length of the Sequence: 6642 

Type: nucleic acid 

Strandedness : double 

Topology : linear 

Molecular type: cDNA to mRNA 

Sequence : 

ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCGT TCCTATTCAC CCTGGTCGCA 60
CTGCTGCCGC CCGGAGCTCT CTGCGAAGTC TGGACGCAGA GGCTGCACGG CGGCAGCGCG 120
CCCTTGCCCC AGGACCGGGG CTTCCTCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180
TGGGCGCGCG GGGATGCCAG GGGGGCGAGC CGCGCGGACG AGAAGCCGCT CCGGAGGAAA 240
CGGAGCGCTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTTAG TCTGAATGAT 300
TCCCACAATC AGATGGTGGT GCACTGGGCT GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360
GCCCGAGATA GCCTGGCATT GGCGAGGCCC AAGAGCAGTG ATGTGTACGT GTCTTACGAC 420
TATGGAAAAT CATTCAAGAA AATTTCAGAC AAGTTAAACT TTGGCTTGGG AAATAGGAGT 480
GAAGCTGTTA TCGCCCAGTT CTACCACAGC CCTGCGGACA ACAAGCGGTA CATCTTTGCA 540
GACGCTTATG CCCAGTACCT CTGGATCACG TTTGACTTCT GCAACACTCT TCAAGGCTTT 600
TCCATCCCAT TTCGGGCAGC TGATCTCCTC CTACACAGTA AGGCCTCCAA CCTTCTCTTG 660
CGCTTTGACA GGTGCCACCC CAACAAGCAG CTGTGGAAGT CAGATGACTT TGGCCAGACC 720
TCGATCATCA TTCAGGAACA TGTCAACTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780
CCAAATACCA TCTACATTGA ACGACACGAA CCCTCTGGCT ACTCCACTGT CTTCCGAAGT 840
ACAGATTTCT TCCAGTCCCG GGAAAACCAG GAAGTGATCC TTGAGGAAGT GAGAGATTTT 900
CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC ATCTCTTGGG CAGTGAACAG 960
CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGAGAGC AGCCGAGTTT 1020
GTCACAAGAC ATCCTATTAA TGAATATTAC ATCGCAGATG CCTCCGAGGA CCAGGTCTTT 1080
GTGTGTGTCA GCCACAGTAA CAACCGCAGC AATTTATACA TCTCAGAGGC AGAGGGGCTG 1140
AAGTTCTCCC TGTCCTTGCA GAACGTGCTC TATTACAGCC CAGGAGGGGC CGGCAGTGAC 1200
ACCTTGGTGA GGTATTTTGC AAATGAACCA TTTGCTGACT TCCACCGAGT GGAAGGATTG 1260
CAAGGAGTCT ACATTGCTAC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCG 1320
GTCATCACCT TTGACAAAGG GGGAACCTGG GAGTTTCTTC AGGCTCCAGC CTTCACGGGA 1380
TATGGAGAGA AAATCAATTG TGAGCTTTCC CAGGGCTGTT CCCTTCATCT GGCTCAGCGC 1440
CTCAGTCAGC TCCTCAACCT CCAGCTCCGG AGAATGCCCA TCCTGTCCAA GGAGTCGGCT 1500
CCAGCCCTCA TCATCGCCAC TGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560
TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAC 1620
ACATGGGGAG ACCACGGCGG AATCATCAGG GCCATTGCCC AGGGCATGGA AACCAACGAG 1680
CTAAAATACA GTACCAATGA AGGGGAGACC TGGAAAACAT TCATCTTCTC TGAGAAGCCA 1740
GTGTTTGTGT ATGGCCTCCT CACAGAACCT GGGGAGAAGA GCACTGTCTT CACCATCTTT 1800
GGCTCGAACA AAGAGAATGT CCACAGCTGG CTGATCGTCC AGGTCAATGC CACGGATGCC 1860
TTGGGAGTTC CCTGCACAGA GAA"GACTAC AAGCTGTGGT CACCATCTGA TGAGCGCGGG 1920
AATGAGTGTT TGCTGGGACA CAAGACTGTT TTCAAACGGC GGACGCCCCA TGGCACATGC 1980
TTCAATGGAG AGGACTTTGA CAGGCCGGTG GTCGTGTCCA ACTGGTCCTG CACCCGGGAG 2040
GACTATGAGT GTGACTTCGG TTTCAAGATG AGTGAAGATT TGTCATTAGA GGTTTGTGTT 2100
CCAGATCCGG AATTTTCTGG AAAGTCATAC TCCCCTCCTG TGCCTTGCCC TGTGGGTTCT 2160
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ACTTACAGGA GAACGAGAGG CTACCGGAAG ATTTCTGGGG ACACTTGTAG CGGAGGAGAT 2220 
GTTGAAGCGC GACTGGAAGG AGAGCTGGTC CCCTGTCCCC TGGCAGAAGA GAACGAGTTC 2280 
ATTCTGTATG CTGTGAGGAA ATCCATCTAC CGCTATGACC TGGCCTCGGG AGCCACCGAG 2340 
CAGTTGCCTC TCACCGGCCT ACGGGCAGCA GTGGCCCTGG ACTTTGACTA TGAGCACAAC 2400 
TGTTTGTATT GGTCCGACCT GGCCTTGGAC GTCATCCAGC GCCTCTGTTT GAATGGAAGC 2460 
ACAGGGCAAG AGGTGATCAT CAATTCTGGC CTGGAGACAG TAGAAGCTTT GGCTTTTGAA 2520 
CCCCTCAGCC AGCTGCTTTA CTGGGTAGAT GCAGGCTTCA AAAAGATTGA GGTAGCTAAT 2580 
CCAGATGGCG ACTTCCGACT CACAATCGTC AATTCCTCTG TGCTTGATCG TCCCAGGGCT 2640 
CTGGTCCTCG TGCCCCAAGA GGGGGTGATG TTCTGGACAG ACTGGCGAGA CCTGAAGCCT 2700 
GGGATTTATC GGAGCAATAT GGATGGTTCT GCTGCCTATC ACCTGGTGTC TGAGGATGTG 2760 
AAGTGGCCCA ATGGCATCTC TGTGGACGAC CAGTGGATTT ACTGGACGGA TGCCTACCTG 2820 
GAGTGCATAG AGCGGATCAC GTTCAGTGGC CAGCAGCGCT CTGTCATTCT GGACAACCTC 2880 
CCGCACCCCT ATGCCATTGC TGTCTTTAAG AATGAAATCT ACTGGGATGA CTGGTCACAG 2940
CTCAGCATAT TCCGAGCTTC CAAATACAGT GGGTCCCAGA TGGAGATTCT GGCAAACCAG 3000 
CTCACGGGGC TCATGGACAT GAAGATTTTC TACAAGGGGA AGAACACTGG AAGCAATGCC 3060 
TGTGTGCCCA GGCCATGCAG CCTGCTGTGC CTGCCCAAGG CCAACAACAG TAGAAGCTGC 3120
AGGTGTCCAG AGGATGTGTC CAGCAGTGTG CTTCCATCAG GGGACCTGAT GTGTGACTGC 3180
CCTCAGGGCT ATCAGCTCAA GAACAATACC TGTGTCAAAG AAGAGAACAC CTGTCTTCGC 3240 
AACCAGTATC GCTGCAGCAA CGGGAACTGT ATCAACAGCA TTTGGTGGTG TGACTTTGAC 3300 
AACGACTGTG GAGACATGAG CGATGAGAGA AACTGCCCTA CCACCATCTG TGACCTGGAC 3360 
ACCCAGmC GTTGCCAGGA GTCTGGGACT TGTATCCCAC TGTCCTATAA ATGTGACCTT 3420 
GAGGATGACT GTGGAGACAA CAGTGATGAA AGTCATTGTG AAATGCACCA GTGCCGGAGT 3480 
GACGAGTACA ACTGCAGTTC CGGCATGTGC ATCCGCTCCT CCTGGGTATG TGACGGGGAC 3540 
AACGACTGCA GGGACTGGTC TGATGAAGCC AACTGTACCG CCATCTATCA CACCTGTGAG 3600 
GCCTCCAACT TCCAGTGCCG AAACGGGCAC TGCATCCCCC AGCGGTGGGC GTGTGACGGG 3660 
GATACGGACT GCCAGGATGG TTCCGATGAG GATCCAGTCA ACTGTGAGAA GAAGTGCAAT 3720 
GGATTCCGCT GCCCAAACGG CACTTGCATC CCATCCAGCA AACATTGTGA TGGTCTGCGT 3780 
GATTGCTCTG ATGGCTCCGA TGAACAGCAC TGCGAGCCCC TCTGTACGCA CTTCATGGAC 3840
TTTGTGTGTA AGAACCGCCA GCAGTCCCTG TTCCACTCCA TGGTCTGTGA CCGAATCATC 3900
CAGTGCCGCG ACCGGTCCGA TGAGGATGGG GCGTTTGCAG GATGCTCCCA ACATCCTGAG 3960
TTCGACAAGG TATGTGATGA GTTCGGTTTC CAGTGTCAGA ATGGAGTGTG GATCACTTTG 4020
ATTTGGAAGT GCGACGGGAT GGATGATTGC GGCGATTATT CTGATGAAGC CAACTGCGAA 4080
AACCCCACAG AAGCCCCAAA CTGCTCCCGC TACTTCCAGT TTCGGTGTGA GAATGGCCAC 4140
TGCATCCCCA ACAGATGGAA ATGTGACAGG GAGAACGACT GTGGGGACTG GTCTGATGAG 4200
AAGGATTGTG GAGATTCACA TATTCTTCCC TTCTCGAC7C CTGGGCCCTC CACGTGTCTG 4260
CCCAATTACT ACCGCTGCAG CAGTGGGACC TGCGTGATGG ACACCTGGGT GTGCGACGGG 4320
TACCGAGATT GTGCAGATGG CTCTGACGAG GAAGCCTGCC CCTTGCTTGC AAACGTCACT 4380
GCTGCCTCCA CTCCCACCCA ACTTCCGCCA TGTGACCGAT TTGAGTTCGA ATGCCACCAA 4440
CCGAAGACGT GTATTCCCAA CTGGAAGCGC TGTGACGGCC ACCAAGATTG CCAGGATGGG 4500
CGGGACGAGG CCAATTGCCC CACACACAGC ACCTTGACTT GCATGAGCAG GGAGTTCCAG 4560
TGCGAGGACG GGGAGGCCTG CATTGTGCTC TCGGAGCGCT GCGACGGCTT GCTCGACTGC 4620
TCGGACGAGA GCGATGAAAA GGCCTGCAGT GATGAGTTGA CTGTGTACAA AGTACAGAAT 4680
CTTCAGTGGA CAGCTGACTT CTCTGGGGAT GTGACTTTGA CCTGGATGAG GCCCAAAAAA 4740
ATGCCCTCTG CATCTTGTGT ATATAATGTC TACTACAGGG TGGTTGGAGA GAGCATATGG 4800
AAGACTCTGG AGACCCACAG CAATAACACA AACACTGTAT TAAAACXTT GAAACCAGAT 4860
ACCACGTATC AGGTTAAAGT ACAGGTTCAG TGTCTCAGCA AGGCACACAA CACCAATGAG 4920
TTTGTGACCC TGACGACCCC AGAGGGATTG CCAGATGCCC CTCGAAATCT CCAGCTGTCA 4980
CTCCCCAGGG AACCAGAAGG TGTGATTGTA GGCCACTGGG CTCCTCCCAT CCACACCCAT 5040
GGCCTCATCC GTGAGTACAT TGTAGAATAC AGCAGGAGTG GTTCCAAGAT GTGGGCCTCC 5100
CAGAGGGCTG CTAGTAACTT TACAGAAATC AAGAACTTAT TGGTCAACAC TCTATACACC 5160
GTCAGAGTGG CTGCGGTGAC TACTCGTGGA ATAGGAAACT GGAGCGATTC TAAATCCATT 5220
ACCACGATAA AAGGAAAAGT GATCCCACCA CCAGATATCC ACATTGACAG CTATGGTGAA 5280
AATTATCTAA GCTTCACCCT GACCATGGAG AGTGATATCA AGGTGAATCG CTATGTGGTG 5340
AACCTTTTCT GGGCATTTGA CACCCACAAG CAAGAGAGGA GAACTTTGAA CTTCCGAGGA 5400
AGCATATTG? CACACAAAGT TGGCAATCTG ACAGCTCATA CATCCTATGA GATTTCTGCC 5460
TGGGCCAAGA CTGACTTGGG GGATAGCCCT CTGGCATTTG AGCATGTTAT GACCAGAGGG 5520
GTTCGCCCAC CTGCACCTAG CCTCAAGGCC AAAGCCATCA ACCAGACTGC AGTGGAATGT 5580
a:ctggaccg GCCCCCGGAA TGTGGTTTAT GGTATTTTCT ATGCGACGTC CTTTCTTGAC 5640
CTCTATCGCA ACCCGAAGAG CTTGACTACT TCACTCCACA ACAAGACGGT CATTGTCAGT 5700
AAGGA~CAGC AGTATTTGTT TCTGGTCCGT GTAGTGGTAC CCTACCAGGG GCCATCCTCT 5760
GACTACGTTG TAGTGAAGAT GATCCCGGAC AGCAGGCTTC CACCCCGTCA CCTGCATGTG 5820
GTTCATACGG GCAAAACCTC CGTGGTCATC AAGTGGGAAT CACCGTATGA CTCTCCTGAC 5880
CAGGACTTGT TGTATGCAAT TGCAGTCAAA GATCTCATAA GAAAGACTGA CAGGAGCTAC 5940
AAAGTAAAAT CCCGTAACAG CACTGTGGAA TACACCCTTA ACAAGTTGGA GCCTGGCGGG 6000
AAATACCACA TCATTGTCCA ACTGGGGAAC ATGAGCAAAG ATTCCAGCAT AAAAATTACC 6060
ACAGTTTCAT TATCAGCACC TGATGCCTTA AAAATCATAA CAGAAAATGA TCATGTTCTT 6120
CTGTTTTGGA AAAGCCTGGC TTTAAAGGAA AAGCATTT7A ATGAAAGCAG GGGCTATGAG 6180
ATACACATGT TTGATAGTGC CATGAATATC ACAGCTTACC TTGGGAATAC TAGTGACAAT 6240
TTTTTAAAA rrccAACCT GAAGATGGGT CATAATTACA CGTTCACCGT CCAAGCAAGA 6300
TGCCTTTTTG GCAACCAGAT CTGTGGGGAG CCTGCCATCC TGCTGTACGA TGAGCTGGGG 6360
TCTCGTGCXG ATGCATCTGC AACGCAGGCT GCCAGATCTA CGGATGTTGC TGGTGTGGTG 6420
G~GCCCAT^T TATTCCTGAT ACTGCTGAGC CTGGGGGTGG GGTTTGCCAT CCTGTACACG 6480
AAGuAoCGuA GGCTGCAGAG CAGCTTCACC GCGTTCGCCA ACAGCCACTA CAGCTCCAGG 6540
C . GuGGToLG CAATCTTCTC CTCTGGGGAT GACCTGGGGG AAGATGATGA AGATGCCCCT 6600
ATGATAACTG GATTTTGAGA TGACGTCCCC ATGGTGATAG CC 6642

Sequence ID No. 6











Length of the Sequence: 2214 
Type : amino acid 
Topology: linear 
Molecular type: Protein 
Sequence : 

Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe

5 10 15 

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr 
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20 25 30 

Gln Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gln Asp Arg Gly Phe
35 40 45

Leu Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly
50 55 60

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys
65 70 75 80

Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val
85 90 95

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu
100 105 110

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala
115 120 125

Arg Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser
130 135 140

Phe Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser
145 150 155 160

Glu Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg
165 170 175

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp
180 185 190

Phe Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp
195 200 205

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg
210 215 220

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr
225 230 235 240

Trp Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp
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245 250 255 

Pro Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser
260 265 270

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu
275 280 285

Asn Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp
290 295 300

Lys Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln
305 310 315 320

Gln Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg
325 330 335

Ala Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala
340 345 350

Asp Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn
355 360 365

Arg Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu
370 375 380

Ser Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp
385 390 395 400

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg
405 410 415

Val Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser
420 425 430

Met Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly
435 440 445

Thr Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys
450 455 460

Ile Asn Cys Glu Leu Ser Gln Gly Cys Ser Leu His Leu Ala Gln Arg
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465 470 475 480 

Leu Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser
485 490 495

Lys Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys
500 505 510

Asn Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala
515 520 525

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp
530 535 540

His Gly Gly Ile Ile Thr Ala Ile Ala Gln Gly Met Glu Thr Asn Glu
545 550 555 560

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Thr Phe Ile Phe
565 570 575

Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu
580 585 590

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His
595 600 605

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro
610 615 620

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly
625 630 635 640

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro
645 650 655

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val
660 665 670

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe
675 680 685

Lys Met Ser Glu Asp Leu Ser Leu Glu Val Cys Val Pro Asp Pro Glu
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690 695 700 

Phe Ser Gly Lys Ser Tyr Ser Pro Pro Val Pro Cys Pro Val Gly Ser
705 710 715 720

Thr Tyr Arg Arg Thr Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys
725 730 735

Ser Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys
740 745 750

Pro Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Val Arg Lys Ser
755 760 765

Ile Tyr Arg Tyr Asp Leu Ala Ser Gly Ala Thr Glu Gln Leu Pro Leu
770 775 780

Thr Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn
785 790 795 800

Cys Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys
805 810 815

Leu Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Gly Leu Glu
820 825 830

Thr Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp
835 840 845

Val Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp
850 855 860

Phe Arg Leu Thr Ile Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala
865 870 875 880

Leu Val Leu Val Pro Gln Glu Gly Val Met Phe Trp Thr Asp Trp Gly
885 890 895

Asp Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala
900 905 910

Tyr His Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val
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915 920 925 

Asp Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Glu Cys Ile Glu
930 935 940

Arg Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Asn Leu
945 950 955 960

Pro His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp
965 970 975

Asp Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser
980 985 990

Gln Met Glu Ile Leu Ala Asn Gln Leu Thr Gly Leu Met Asp Met Lys
995 1000 1005

Ile Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg
1010 1015 1020

Pro Cys Ser Leu Leu Cys Leu Pro Lys Ala Asn Asn Ser Arg Ser Cys
1025 1030 1035 1040

Arg Cys Pro Glu Asp Val Ser Ser Ser Val Leu Pro Ser Gly Asp Leu
1045 1050 1055

Met Cys Asp Cys Pro Gln Gly Tyr Gln Leu Lys Asn Asn Thr Cys Val
1060 1065 1070

Lys Glu Glu Asn Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly
1075 1080 1085

Asn Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly
1090 1095 1100

Asp Met Ser Asp Glu Arg Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp
1105 1110 1115 1120

Thr Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr
1125 1130 1135

Lys Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Ser His
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L 140 H45 1150 

Cys Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly
1155 1160 1165

Met Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg
1170 1175 1180

Asp Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu
1185 1190 1195 1200

Ala Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp
1205 1210 1215

Ala Cys Asp Gly Asp Thr Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro
1220 1225 1230

Val Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr
1235 1240 1245

Cys Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp
1250 1255 1260

Gly Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr His Phe Met Asp
1265 1270 1275 1280

Phe Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys
1285 1290 1295

Asp Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Ala Ala Phe
1300 1305 1310

Ala Gly Cys Ser Gln Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe
1315 1320 1325

Gly Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys
1330 1335 1340

Asp Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu
1345 1350 1355 1360

Asn Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys
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U 



1365 1370 1375

Glu Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn
1380 1385 1390

Asp Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile
1395 1400 1405

Leu Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr
1410 1415 1420

Arg Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly
1425 1430 1435 1440

Tyr Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu
1445 1450 1455

Ala Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp
1460 1465 1470

Arg Phe Glu Phe Glu Cys His Gln Pro Lys Thr Cys Ile Pro Asn Trp
1475 1480 1485

Lys Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Arg Asp Glu Ala
1490 1495 1500

Asn Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Arg Glu Phe Gln
1505 1510 1515 1520

Cys Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly
1525 1530 1535

Phe Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu
1540 1545 1550

Leu Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser
1555 1560 1565

Gly Asp Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala
1570 1575 1580

Ser Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp
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1585 L590 1595 1600 

Lys Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val
1605 1610 1615

Leu Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu
1620 1625 1630

Ser Lys Ala His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu
1635 1640 1645

Gly Leu Pro Asp Ala Pro Arg Asn Leu Gln Leu Ser Leu Pro Arg Glu
1650 1655 1660

Ala Glu Gly Val Ile Val Gly His Trp Ala Pro Pro Ile His Thr His
1665 1670 1675 1680

Gly Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys
1685 1690 1695

Met Trp Ala Ser Gln Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn
1700 1705 1710

Leu Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser
1715 1720 1725

Arg Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Ile Lys
1730 1735 1740

Gly Lys Val Ile Pro Pro Pro Asp Ile His Ile Asp Ser Tyr Gly Glu
1745 1750 1755 1760

Asn Tyr Leu Ser Phe Thr Leu Thr Met Glu Ser Asp Ile Lys Val Asn
1765 1770 1775

Gly Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu
1780 1785 1790

Arg Arg Thr Leu Asn Phe Arg Gly Ser Ile Leu Ser His Lys Val Gly
1795 1800 1805

Asn Leu Thr Ala His Thr Ser Tyr Glu Ile Ser Ala Trp Ala Lys Thr
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1810 1815 1820 

Asp Leu Gly Asp Ser Pro Leu Ala Phe Glu His Val Met Thr Arg Gly
1825 1830 1835 1840

Val Arg Pro Pro Ala Pro Ser Leu Lys Ala Lys Ala Ile Asn Gln Thr
1845 1850 1855

Ala Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile
1860 1865 1870

Phe Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Lys Ser Leu
1875 1880 1885

Thr Thr Ser Leu His Asn Lys Thr Val Ile Val Ser Lys Asp Glu Gln
1890 1895 1900

Tyr Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser
1905 1910 1915 1920

Asp Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg
1925 1930 1935

His Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp
1940 1945 1950

Glu Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala
1955 1960 1965

Val Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser
1970 1975 1980

Arg Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly
1985 1990 1995 2000

Lys Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser
2005 2010 2015

Ile Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile
2020 2025 2030

Ile Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu
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2035 2040 2045 

Lys Glu Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe
2050 2055 2060

Asp Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn
2065 2070 2075 2080

Phe Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr
2085 2090 2095

Val Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala
2100 2105 2110

Ile Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr
2115 2120 2125

Gln Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu
2130 2135 2140

Phe Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr
2145 2150 2155 2160

Lys His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His
2165 2170 2175

Tyr Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu
2180 2185 2190

Gly Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp
2195 2200 2205

Val Pro Met Val Ile Ala
2210

Sequence ID No. 7 
Length of the Sequence: 6843 
Type: nucleic acid 
Strandedness : double 
Topology : linear 
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Molecular type: cDNA to mRNA 
Feature: 

Name/Key: sig peptide 

Location: 81.. 164 

Identification method: S 

Name/Key: mat peptide

Location: 165..6722

Identification method: S 
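
As a quick consistency check on the feature table above, the following minimal Python sketch converts the two listed locations into codon counts. It assumes the locations are 1-based and inclusive; the names `FEATURES` and `codon_count` are hypothetical and are not part of the listing.

```python
# Feature coordinates as given in the listing (assumed 1-based, inclusive).
FEATURES = {
    "sig peptide": (81, 164),
    "mat peptide": (165, 6722),
}


def codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in an inclusive 1-based span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length // 3


counts = {name: codon_count(*span) for name, span in FEATURES.items()}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n} codons")
print(f"total: {total} codons")
```

Under these assumptions the two spans give 28 and 2186 codons, 2214 in total, which agrees with the length of 2214 stated for Sequence ID No. 6 and, at three nucleotides per codon, with the length of 6642 stated for Sequence ID No. 5.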
Sequence : 

CCG GCCCAGCGGC TCTCCTGGCC 
TCGCGCTCCA CATTCTCTCC TGGCGGCGGC GCCACCTGCA GTAGCGTTCG CCCGAACATG 

Met 
1 

GCG ACA CGG AGC AGC AGG AGG GAG TCG CGA CTC CCG TTC CTA TTC ACC
Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr
5 10 15

CTG GTC GCA CTG CTG CCG CCC GGA GCT CTC TGC GAA GTC TGG ACG GAG
Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr Gln
20 25 30

AGG CTG CAC GGC GGC AGC GCG CCC TTG CCC CAG GAC CGG GGC TTC CTC
Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gln Asp Arg Gly Phe Leu
35 40 45

GTG GTG CAG GGC GAC CCG GGC GAG CTG CGG CTG TGG GCG CGC GGG GAT
Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly Asp
50 55 60 65

GGC AGG GGG GCG AGC CGC GCG GAC GAG AAG CCG CTC CGG AGG AAA CGG 
Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys Arg 
70 75 80 






EP 0 773 290 A2 



AGC GCT GCC CTG CAG CCC GAG CCC ATC AAG GTG TAC GGA CAG GTT AGT 371
Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val Ser
85 90 95

CTG AAT GAT TCC CAC AAT CAG ATG GTG GTG CAC TGG GCT GGA GAG AAA 419
Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu Lys
100 105 110

AGC AAC GTG ATC GTG GCC TTG GCC CGA GAT AGC CTG GCA TTG GCG AGG 467
Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala Arg
115 120 125

CCC AAG AGC AGT GAT GTG TAC GTG TCT TAC GAC TAT GGA AAA TCA TTC 515
Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser Phe
130 135 140 145

AAG AAA ATT TCA GAC AAG TTA AAC TTT GGC TTG GGA AAT AGG AGT GAA 563
Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser Glu
150 155 160

GCT GTT ATC GCC CAG TTC TAC CAC AGC CCT GCG GAC AAC AAG CGG TAC 611
Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg Tyr
165 170 175

ATC TTT GCA GAC GCT TAT GCC CAG TAC CTC TGG ATC ACG TTT GAC TTC 659
Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp Phe
180 185 190

TGC AAC ACT CTT CAA GGC TTT TCC ATC CCA TTT CGG GCA CCT GAT CTC 707
Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp Leu
195 200 205

CTC CTA CAC AGT AAG GCC TCC AAC CTT CTC TTG GGC TTT GAC AGG TCC 755
Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg Ser
210 215 220 225

CAC CCC AAC AAG CAG CTG TGG AAG TCA GAT GAC TTT GGC CAG ACC TGG 803
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His Pro Asn Lys Gin Leu Trp Lys Ser Asp Asp Phe Gly Gin Thr Trp 

230 235 240 

ATC ATG ATT CAG GAA CAT GTC AAG TCC TTT TCT TGG GCA ATT GAT CCC 851 
Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp Pro

245 250 255

TAT GAC AAA CCA AAT ACC ATC TAC ATT GAA CCA CAC GAA CCC TCT GGC 899
Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser Gly

260 265 270

TAC TCC ACT GTC TTC CGA ACT ACA GAT TTC TTC CAG TCC CGG GAA AAC 947
Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu Asn

275 280 285

CAG GAA GTG ATC CTT GAG GAA GTG AGA GAT TTT CAG CTT CGG GAC AAG 995
Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp Lys
290 295 300 305

TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG GGC AGT GAA CAG CAG 1043 
Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln Gln

310 315 320

TCT TCT GTC CAG CTC TGG GTC TCC TTT GGC CGG AAG CCC ATG ACA GCA 1091
Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg Ala

325 330 335

GCC CAG TTT GTC ACA AGA CAT CCT ATT AAT GAA TA~ TAC ATC GCA GAT 1139 
Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala Asp

340 345 350

GCC TCC GAG GAC CAG GTG TTT GTG TGT GTC AGC GAC AGT AAC AAC CGC 1187
Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn Arg
355 360 365

ACC AAT TTA TAC ATC TCA GAG GCA GAG GGG CTG AAG TTC TCC CTG TCC 1235
Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu Ser
370 375 380 385

TTG GAG AAC GTG CTC TAT TAC AGC CCA GGA GGG GCC GGC AGT GAC ACC 1283
Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp Thr

390 395 400

TTG GTG AGG TAT TTT GCA AAT GAA CCA TTT GCT GAC TTC CAC CGA GTG 1331
Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg Val
405 410 415

GAA GGA TTG CAA GGA GTC TAC ATT GCT ACT CTG ATT AAT GGT TCT ATG 1379
Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser Met
420 425 430

AAT GAG GAG AAC ATG AGA TCG GTC ATC ACC TTT CAC AAA GGG GGA ACC 1427
Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly Thr
435 440 445

TGG GAG TTT CTT CAG GCT CCA GCC TTC ACG GGA TAT GGA GAG AAA ATC 1475
Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys Ile

450 455 460 465

AAT TGT GAG CTT TCC CAG GGC TGT TCC CTT CAT CTG GCT CAG CGC CTC 1523
Asn Cys Glu Leu Ser Gln Gly Cys Ser Leu His Leu Ala Gln Arg Leu

470 475 480

AGT CAG CTC CTC AAC CTC CAG CTC CGG AGA ATG CCC ATC CTG TCC AAG 1571
Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser Lys

485 490 495

GAG TCG GCT CCA GGC CTC ATC ATC GCC ACT GGC TCA GTG GGA AAG AAC 1619
Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys Asn

500 505 510

TG GCT AGC AAG ACA AAC GTG TAC ATC TCT AGC AGT GCT GGA GCC AGG 1667 
Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala Arg

515 520 525

TGG CGA GAG GCA CTT CCT GGA CCT CAC TAC TAC ACA TGG GGA GAC CAC 1715
Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp His

530 535 540 545

GGC GGA ATC ATC ACG GCC ATT GCC CAG GGC ATG GAA ACC AAC GAG CTA 1763
Gly Gly Ile Ile Thr Ala Ile Ala Gln Gly Met Glu Thr Asn Glu Leu

550 555 560

AAA TAC ACT ACC AAT GAA GGG GAG ACC TGG AAA ACA TTC ATC TTC TCT 1811 
Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Thr Phe Ile Phe Ser

565 570 575 

GAG AAG CCA GTG TTT GTG TAT GGC CTC CTC ACA GAA CCT GGG GAG AAG 1859 

Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu Lys

580 585 590

AGC ACT CTC TTC ACC ATC TTT GGC TCG AAC AAA GAG AAT GTC CAC AGC 1907
Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His Ser

595 600 605

-CG.:~3 \TC ZTC CAG GTC AAT GCC ACG GAT GCC TTC GGA GTT CCC TGC 1S55 
Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro Cys
610 615 620 625

ACA GAG AAT GAC TAC AAG CTG TGG TCA CCA TCT GAT GAG CGG GGG AAT 2003
Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly Asn

630 635 640

GAG TGT TTG CTG GGA CAC AAG ACT GTT TTC AAA CGG CGG ACC CCC CAT 2051
Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro His

645 650 655

GCC ACA TCC TTC AAT GGA GAG GAC TT GAC AGG CCG GTG GTC GTG TCC 2099
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser

660 665 670 

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAG TGT GAC TTC GGT TTC AAG 2147 
Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe Lys

675 680 685

ATG AGT GAA GAT TTG TCA TTA GAG GTT TGT GTT CCA GAT CCG GAA TTT 2195
Met Ser Glu Asp Leu Ser Leu Glu Val Cys Val Pro Asp Pro Glu Phe

690 695 700 705

TCT GGA AAG TCA TAC TCC CCT CCT GTG CCT TGC CCT GTG GGT TCT ACT 2243
Ser Gly Lys Ser Tyr Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr

710 715 720

TAC AGG AGA ACG AGA GGC TAC CGG AAG ATT TCT GGG GAC ACT TGT AGC 2291
Tyr Arg Arg Thr Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser

725 730 735

GGA GGA GAT GTT GAA GCG CGA CTG GAA GGA GAG CTG GTC CCC TGT CCC 2339
Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro

740 745 750

CTG GCA GAA GAG AAC GAG TTC ATT CTG TAT GCT GTG AGG AAA TCC ATC 2387
Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Val Arg Lys Ser Ile

755 760 765

TAC CGC TAT GAC CTG GCC TCG GGA GCC ACC GAG CAG TTG CCT CTC ACC 2435
Tyr Arg Tyr Asp Leu Ala Ser Gly Ala Thr Glu Gln Leu Pro Leu Thr

770 775 780 785

GGG CTA CGG GCA GCA GTG GCC CTG GAC TTT GAC TAT GAG CAC AAC TGT 2483
Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys

790 795 800

TTG TAT TGG TCC GAC CTG GCC TTG GAC GTC ATC CAG CGC CTC TGT TTG 2531
Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu

805 810 815

AAT GGA AGC ACA GGG CAA GAG GTG ATC ATC AAT TCT GGC CTG GAG ACA 2579
Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Gly Leu Glu Thr

820 825 830

GTA GAA GCT TTG GCT TTT GAA CCC CTC AGC CAG CTG CTT TAC TGG GTA 2627 

Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val

835 840 845

GAT GCA GGC TTC AAA AAG ATT GAG GTA GCT AAT CCA GAT GGC GAC TTC 2675
Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe

850 855 860 865

CGA CTC ACA ATC GTC AAT TCC TCT GTG CTT GAT CGT CCC AGG GCT CTG 2723
Arg Leu Thr Ile Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu

870 875 880

GTC CTC GTG CCC CAA GAG GGG GTG ATG TTC TGG ACA GAC TGG GGA GAC 2771
Val Leu Val Pro Gln Glu Gly Val Met Phe Trp Thr Asp Trp Gly Asp

885 890 895

CTG AAG CCT GGG ATT TAT CGG AGC AAT ATG GAT CGT TCT GCT GCC TAT 2819
Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr

900 905 910

CAC CTG GTG TCT GAG GAT GTG AAG TGG CCC AAT GGC ATC TCT GTG GAC 2867
His Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp

915 920 925

GAC CAG TGG ATT TAC TGG ACG GAT GCC TAC CTG GAG TGC ATA GAG CGG 2915
Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Glu Cys Ile Glu Arg

930 935 940 945

ATC ACG TTC ACT GGC CAG CAG CGC TCT GTC ATT CTG GAC AAC CTC CCG 2963
Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Asn Leu Pro

950 955 960

CAC CCC TAT GCC ATT GCT GTC TTT AAG AAT GAA ATC TAC TGG GAT GAC 3011
His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp

965 970 975

TGG TCA CAG CTC AGC ATA TTC CGA GCT TCC AAA TAC AGT GGG TCC CAG 3059
Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln

980 985 990

ATG GAG ATT CTG GCA AAC CAG CTC ACG GGG CTC ATG GAC ATG AAG ATT 3107
Met Glu Ile Leu Ala Asn Gln Leu Thr Gly Leu Met Asp Met Lys Ile

995 1000 1005

TTC TAC AAG GGG AAG AAC ACT GGA AGC AAT GCC TGT GTG CCC AGG CCA 3155
Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro
1010 1015 1020 1025

TGC AGC CTG CTG TGC CTG CCC AAG GCC AAC AAC AGT AGA AGC TGC AGG 3203
Cys Ser Leu Leu Cys Leu Pro Lys Ala Asn Asn Ser Arg Ser Cys Arg

1030 1035 1040

TGT CCA GAG GAT GTG TCC AGC AGT GTG CTT CCA TCA GGG GAC CTG ATG 3251
Cys Pro Glu Asp Val Ser Ser Ser Val Leu Pro Ser Gly Asp Leu Met

1045 1050 1055

TGT GAC TGC CCT CAG GGC TAT CAG CTC AAG AAC AAT ACC TGT GTC AAA 3299
Cys Asp Cys Pro Gln Gly Tyr Gln Leu Lys Asn Asn Thr Cys Val Lys

1060 1065 1070

GAA GAG AAC ACC TGT CTT CGC AAC CAG TAT CGC TGC ACC AAC GGG AAC 3347
Glu Glu Asn Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn

1075 1080 1085

TGT ATC AAC AGC ATT TGG TGG TGT GAC TTT GAC AAC GAC TGT GGA GAC 3395
Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp
1090 1095 1100 1105

ATG AGC GAT GAG AGA AAC TGC CCT ACC ACC ATC TGT GAC CTG GAC ACC 3443
Met Ser Asp Glu Arg Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr

1110 1115 1120

CAG TTT CGT TGC CAG GAG TCT GGG ACT TGT ATC CCA CTG TCC TAT AAA 3491
Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys

1125 1130 1135

TGT CAC CTT GAG GAT GAC TGT GGA GAC AAC AGT GAT GAA AGT CAT TGT 3539
Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Ser His Cys

1140 1145 1150

GAA ATG CAC CAG TGC CGG AGT GAC GAG TAC AAC TGC AGT TCC GGC ATG 3587 

Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met

1155 1160 1165

TGC ATC CGC TCC TCC TGG GTA TGT GAC GGG GAC AAC GAC TGC AGG GAC 3635
Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp
1170 1175 1180 1185

TGG TCT GAT GAA GCC AAC TGT ACC GCC ATC TAT CAC ACC TGT GAG GCC 3683
Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala

1190 1195 1200

TCC AAC TTC CAG TGC CGA AAC GGG CAC TGC ATC CCC CAG CGG TGG GCG 3731
Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala

1205 1210 1215

TGT GAC GGG GAT ACG GAC TGC CAG GAT GGT TCC GAT GAG GAT CCA GTC 3779
Cys Asp Gly Asp Thr Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Val

1220 1225 1230

AAC TGT GAG AAG AAG TGC AAT GGA TTC CGC TGC CCA AAC GGC ACT TGC 3827
Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys

1235 1240 1245

ATC CCA TCC AGC AAA CAT TGT GAT GGT CTG CCT GAT TGC TCT GAT GGC 3875
Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp Gly
1250 1255 1260 1265

TCC GAT GAA CAG CAC TGC GAG CCC CTC TGT ACG CAC TTC ATG GAC TTT 3923
Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr His Phe Met Asp Phe

1270 1275 1280

GTG TGT AAG AAC CGC CAG CAG TGC CTG TTC CAC TCC ATG GTC TGT GAC 3971
Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp

1285 1290 1295

GGA ATC ATC CAG TGC CGC GAC GGG TCC GAT GAG GAT GCG GCG TTT GCA 4019
Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Ala Ala Phe Ala

1300 1305 1310

GGA TGC TCC CAA GAT CCT GAG TTC CAC AAG GTA TGT GAT GAG TTC GGT 4067
Gly Cys Ser Gln Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly

1315 1320 1325

TTC CAG TGT CAG AAT GGA GTG TGC ATC AGT TTG ATT TGG AAG TGC GAC 4115
Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp
1330 1335 1340 1345

GGG ATG GAT GAT TGC GGC GAT TAT TCT GAT GAA GCC AAC TGC GAA AAC 4163
Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn

1350 1355 1360

CCC ACA GAA GCC CCA AAC TGC TCC CGC TAC TTC CAG TTT CGG TGT GAG 4211
Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Glu

1365 1370 1375

AAT GGC CAC TGC ATC CCC AAC AGA TGG AAA TGT GAC AGG GAG AAC GAC 4259
Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp

1380 1385 1390

TGT GGG GAC TGG TCT GAT GAG AAG OA? TGT GGA GAT TCA CAT ATT CTT 4307 
Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile Leu

1395 1400 1405 

CCC TTC TCG ACT CCT GGG CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC 4355 
Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg 
1410 1415 1420 1425 

TGC AGC AGT GGG ACC TGC GTG ATG GAC ACC TGG GTC TGC GAC GGG TAC 4403
Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly Tyr

1430 1435 1440

CGA GAT TGT GCA GAT GGC TCT GAC GAG GAA GCC TGC CCC TTC CTT GCA 4451
Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu Ala

1445 1450 1455

AAC GTC ACT GCT GCC TCC ACT CCC ACC CAA CTT GGG CGA TGT GAC CGA 4499
Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp Arg

1460 1465 1470

TTT GAG TTC GAA TGC CAC CAA CCC AAC ACC TGT ATT CCC AAC TGG AAG 4547
Phe Glu Phe Glu Cys His Gln Pro Lys Thr Cys Ile Pro Asn Trp Lys

1475 1480 1485

CGC TGT GAC GGC CAC CAA GAT TGC CAG GAT GGC CGG GAC GAG GCC AAT 4595
Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Arg Asp Glu Ala Asn
1490 1495 1500 1505

TGC CCC ACA CAC AGC ACC TTC ACT TGC ATG AGC AGG GAG TTC CAG TGC 4643
Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Arg Glu Phe Gln Cys

1510 1515 1520

GAG GAC GGG GAG GCC TGC ATT GTG CTC TCG GAG CGC TGC GAC GGC TTC 4691
Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe

1525 1530 1535

CTG GAC TGC TCG GAC GAG AGC GAT GAA AAG GCC TGC AGT GAT GAG TTG 4739
Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu

1540 1545 1550

ACT GTG TAC AAA GTA CAG AAT CTT CAG TGG ACA GCT GAC TTC TCT GGG 4787
Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly

1555 1560 1565 

GAT GTG ACT TTG ACC TGG ATG AGG CCC AAA AAA ATG CCC TCT GCA TCT 4835 
Asp Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ser

1570 1575 1580 1585 

TGT GTA TAT AAT GTC TAC TAC AGG GTG GTT GGA GAG AGC ATA TGG AAG 4883 

Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys

1590 1595 1600

ACT CTG GAG ACC CAC AGC AAT AAG ACA AAC ACT GTA TTA AAA GTC TTG 4931
Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu

1605 1610 1615

AAA CCA GAT ACC ACG TAT CAG GTT AAA GTA CAG GTT CAG TGT CTC ACC 4979
Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser

1620 1625 1630

AAG GCA CAC AAC ACC AAT GAC TTT GTG ACC CTG AGG ACC CCA GAG GGA 5027
Lys Ala His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly

1635 1640 1645

TTG CCA GAT GCC CCT CGA AAT CTC CAG CTG TCA CTC CCC AGG GAA GCA 5075
Leu Pro Asp Ala Pro Arg Asn Leu Gln Leu Ser Leu Pro Arg Glu Ala
1650 1655 1660 1665

GAA GGT GTG ATT GTA GGC CAC TGG GCT CCT CCC ATC CAC ACC CAT GGC 5123
Glu Gly Val Ile Val Gly His Trp Ala Pro Pro Ile His Thr His Gly

1670 1675 1680

CTC ATC CGT GAG TAC ATT GTA GAA TAC AGC AGG ACT GGT TCC AAG ATG 5171
Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Met

1685 1690 1695

TGG GCC TCC CAG AGG GCT GCT AGT AAC TTT ACA GAA ATC AAG AAC TTA 5219
Trp Ala Ser Gln Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu

1700 1705 1710

TTG GTC AAC ACT CTA TAC ACC GTC AGA GTG GCT GCG GTG ACT ACT CGT 5267
Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg
1715 1720 1725

GGA ATA GGA AAC TGG AGC GAT TCT AAA TCC ATT ACC ACC ATA AAA GGA 5315
Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Ile Lys Gly
1730 1735 1740 1745

AAA GTG ATC CCA CCA CCA GAT ATC CAC ATT GAC AGC TAT GGT GAA AAT 5363
Lys Val Ile Pro Pro Pro Asp Ile His Ile Asp Ser Tyr Gly Glu Asn

1750 1755 1760

TAT CTA AGC TTC ACC CTG ACC ATG GAG AGT GAT ATC AAG GTG AAT GGC 5411
Tyr Leu Ser Phe Thr Leu Thr Met Glu Ser Asp Ile Lys Val Asn Gly

1765 1770 1775

TAT GTG GTG AAC CTT TTC TGG GCA TTT GAC ACC CAC AAG CAA GAG AGG 5459
Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Arg

1780 1785 1790

AGA ACT TTG AAC TTC CGA GGA AGC ATA TTG TCA CAC AAA GTT GGC AAT 5507
Arg Thr Leu Asn Phe Arg Gly Ser Ile Leu Ser His Lys Val Gly Asn

1795 1800 1805

CTG ACA GCT CAT ACA TCC TAT GAG ATT TCT GCC TGG GCC AAG ACT GAC 5555
Leu Thr Ala His Thr Ser Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp
1810 1815 1820 1825

TTG GGG GAT AGC CCT CTG GCA TTT GAG CAT GTT ATG ACC AGA GGG GTT 5603
Leu Gly Asp Ser Pro Leu Ala Phe Glu His Val Met Thr Arg Gly Val

1830 1835 1840

CGC CCA CCT GCA CCT AGC CTC AAG GCC AAA GCC ATC AAC CAG ACT GCA 5651
Arg Pro Pro Ala Pro Ser Leu Lys Ala Lys Ala Ile Asn Gln Thr Ala

1845 1850 1855

GTG GAA TGT ACC TGG ACC GGC CCC CGG AAT GTG GTT TAT GGT ATT TTC 5699
Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe
1860 1865 1870

TAT GCC ACG TCC TTT CTT GAC CTC TAT CGC AAC CCG AAG AGC TTG ACT 5747
Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Lys Ser Leu Thr

1875 1880 1885

ACT TCA CTC CAC AAC AAG ACG GTC ATT GTC AGT AAG GAT GAG CAG TAT 5795
Thr Ser Leu His Asn Lys Thr Val Ile Val Ser Lys Asp Glu Gln Tyr
1890 1895 1900 1905

TTG TTT CTG GTC CGT GTA GTG GTA CCC TAC CAG GGG CCA TCC TCT GAC 5843
Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser Asp

1910 1915 1920

TAC CTT GTA GTG AAG ATG ATC CCG GAC AGC AGG CTT CCA CCC CGT CAC 5891
Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His

1925 1930 1935

CTG CAT GTG GTT CAT ACG GGC AAA ACC TCC GTG GTC ATC AAG TGG GAA 5939
Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp Glu

1940 1945 1950

TCA CCG TAT GAC TCT CCT GAC CAG GAC TTG TTG TAT GCA ATT GCA GTC 5987
Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala Val

1955 1960 1965

AAA GAT CTC ATA AGA AAG ACT GAC AGG AGC TAC AAA GTA AAA TCC CGT 6035
Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg
1970 1975 1980 1985

AAC AGC ACT GTG GAA TAC ACC CTT AAC AAG TTG GAG CCT GGC GGG AAA 6083
Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly Lys

1990 1995 2000

TAC CAC ATC ATT GTC CAA CTG GGG AAC ATG AGC AAA GAT TCC AGC ATA 6131
Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser Ile

2005 2010 2015

AAA ATT ACC ACA GTT TCA TTA TCA GCA CCT GAT GCC TTA AAA ATC ATA 6179
Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile

2020 2025 2030

ACA GAA AAT GAT CAT GTT CTT CTG TTT TGG AAA AGC CTC GCT TTA AAG 6227
Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys

2035 2040 2045

GAA AAG CAT TTT AAT GAA AGC AGG GGC TAT GAG ATA CAC ATG TTT GAT 6275
Glu Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp
2050 2055 2060 2065

AGT GCC ATG AAT ATC ACA GCT TAC CTT GGG AAT ACT ACT GAC AAT TTC 6323
Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe

2070 2075 2080

TTT AAA ATT TCC AAC CTG AAG ATG GGT CAT AAT TAC ACG TTC ACC GTC 6371
Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val

2085 2090 2095

CAA GCA AGA TGC CTT TTT GGC AAC CAG ATC TG" GGG GAG CCT GCC ATC 6413 
Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala Ile

2100 2105 2110 

CTG CTG TAC GAT GAG CTG GGG TCT GGT GCA GA" GCA TCT GCA \CG CAG 6467 
Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr Gln

2115 2120 2125 

GCT GCC AGA TCT ACG GAT GTT GCT GCT GTG GTG GTG CCC ATC TTA TTC 6515
Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe
2130 2135 2140 2145

CTG ATA CTG CTG AGC CTG GGG GTG GGG TTT GCC ATC CTG TAC ACG AAG 6563
Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys

2150 2155 2160

CAC CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC 6611
His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr

2165 2170 2175

AGC TCC AGG CTG GGG TCC GCA ATC TT: TCC TCT GGG GAT GAC CTG GGG 6659 
Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly

2180 2185 2190 

GAA GAT GAT GAA GAT GCC CCT ATG ATA ACT GGA TTT TCA GAT GAC GTC 6707 
Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val

2195 2200 2205

CCC ATG GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6762
Pro Met Val Ile Ala
2210 

TTTTATTTGA TAAAGATAGT TGATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6822 
GTTATTTTTA TATGGGCCAA A 6843 
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: KOWA CO., LTD.
(B) STREET: 6-29, Nishiki 3-chome, Naka-ku, Nagoya-shi,
(C) CITY: Aichi
(E) COUNTRY: Japan
(F) POSTAL CODE (ZIP): none

(ii) TITLE OF INVENTION: NOVEL LDL RECEPTOR ANALOG PROTEIN AND THE
GENE CODING THEREFOR

(iii) NUMBER OF SEQUENCES: 7

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6639 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



AT G-G C 3 A — r\ C GGAGCAGCAG GAGGGAGTCG CGACTCCCCT TCCTATTCAC CCTGGTCGCG 60
CCGGGGCTCT CTGCGAGGTG TGGACGCGGA CACTGCACGG CGGCCGCGCG 120
CCCTTACCrC AGGAGCGGGG CTTCCGCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180
GGGGGCGAGC CGGGCGGACG AGAAGCCGCT CCGGAGGAGA 240
CGGAG_ 3 CTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTCAG CCTCAATGAT 300
TCCCACAATC AGATGGTGGT GCACTGGGCC GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360
GCCCCGCACA GCCTGGCGTT GGCCAGGCCC AGGAGCAGTG ATGTGTACGT GTCTTATGAC 420
TATGGAAAAT CATTCAATAA GATTTCAGAG AAATTGAACT TCGGCGCGGG AAATAACACA 480
GAGGCTGTGG TGGCCCAGTT CTACCACAGC CCTGCGGACA ACAAACGGTA CATCTTCGCA 540
GATGCCTACG C-CAGTATCT CTGGATCACG TTTGACTTCT GCAACACCAT CCATGGCTTT 600
TCCATCCCGT TCCGGGCAGC TGATCTCCTA CTCCACAGTA AGGCCTCCAA CCTTCTCCTG 660
GGCTTCGACA GGTCTCACCC CAACAAGCAG CTGTGGAAGT CGGATGATTT TGGCCAGACC 720
TGGATCATGA TTCAAGAACA CGTGAAGTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780
CCAAACACCA TCTACATCGA ACGGCACGAA CCTT-rTGC-CT ACTCC\CGCT TTTCCGAAGT 840
ACAGACTTCT TCCAGTCCCG CGAAAACCAG GAAGTGATCT TGGAGGAAGT GAGAGACTTT 900
CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC Al CTCTTGGG CAGTCCACTG 960
CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGCGGGC CGCCCAGTTT 1020
GTTACAAGAC ATCCTATCAA CGAATATTAC ATCGCGGATG CCTCGGAGGA CCAGGTGTTT 1080
GTGTGTGTCA GTCACAGCAA CAACCGCACC AACCTCTACA TCTCGGAGGC AGAGGGCTTG 1140
AAGTTCTCTC TGTCCCTGGA GAACGTGCTC TACTACACCC CGGGAGGGGC CGGCAGTGAC 1200
ACCTTGGTGA GGTACTTTGC AAATGAACCG TTTGCTGACT TCCATCGTGT GGAAGGGTTG 1260
CAGGGAGTCT ACATTGCTAC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCT 1320
GTCATCACCT TTGACAAAGG GGGCACCTGG GAATTTCTGC AGGCTCCAGC CTTCACGGGG 1380
TATGGAGAGA AAATCAACTG TGAGCTGTCC GAGGGCTGTT CCCTCCACCT GGCCCAGCGC 1440
CTCAGCCAGC TGCTCAACCT CCAGCTCCGG AGGATGCCCA TCCTGTCCAA GGAGTCGGCG 1500
CCTGGCCTCA TCATTGCCAC GGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560
TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAT 1620
ACATGGGGAG ACCATGGCGG CATCATCATG GCCATTGCCC AAGGCATGGA AACCAACGAA 1680
CTGAAGTACA GTACCAACGA AGGGGAGACC TGGAAAGCCT TCACCTTCTC TGAGAAGCCC 1740
GTGTTTGTGT ATGGGCTCCT CACGGAACCC GGCGAGAAGA GCACGGTCTT CACCATCTTT 1800
GGCTCCAACA AGGAGAACGT GCACAGCTGG CTCATCCTCC AGGTCAATGC CACAGACGCC 1860
CTGGGGGTTC CTTGCACAGA GAACGACTAC AAGCTCTGGT CACCATCTGA TGAGCGGGGG 1920
AATGAGTGTT TGCTTGGACA CAAGACTGTT TTCAAACGGA GGACCCCGCA CGCCACATGC 1980
TTTAACGGAG AAGACTTTGA CAGGCCGGTG GTTGTGTCCA ACTGCTCCTG CACCCGGGAG 2040
GACTATGAGT GTGACTTTGG CTTCCGGATG AGTGAAGACT TGGCATTAGA GGTGTGTGTT 2100
CCAGATCCAG GATTTTCTGG AAAGTCCTCC CCTCCAGTGC CTTGTCCCGT GGGCTCTACG 2160
TACAGGCGAT CAAGAGGCTA CCGGAAGATT TCTGGGGACA CCTGTAGTGG AGGAGATGTT 2220
GAGGCACCGC TAGAAGGAGA GCTGGTCCCC TGTCCCCTGG CAGAAGAGAA CGAGTTCATC 2280
CTGTACGCCA CGCGCAAGTC CATCCACCGC TATGACCTGG CTTCCGGAAC CACGGAGCAG 2340
TTGCCCCTCA CTGGGTTGCG GGCAGCAGTG GCCCTGGACT TTGACTATGA GCACAACTGC 2400
CTGTATTGGT CTGACCTGGC CTTGGACGTC ATCCAGCGCC TCTGTTTGAA CGGGAGTACA 2460
GGACAAGAGG TGATCATCAA CTCTGACCTG GAGACGGTAG AAGCTTTGGC TTTTGAACCC 2520
CTCAGCCAAT TACTTTACTG GGTGGACGCA GGCTTTAAAA AGATCGAGGT AGCCAATCCA 2580
GATGGTGACT TCCGACTCAC CGTCGTCAAT TCCTCGGTGC TGGATCGGCC CCGGGCCCTG 2640
GTCCTTGTGC CCCAAGAAGG GATCATGTTC TGGACCGACT GGGGAGACCT GAAGCCTGGG 2700
ATTTATCGGA GCAACATGGA CGGATCTGCC GCCTATCCCC TCGTGTCGGA GGATGTGAAG 2760
TGGCCCAATG GCATTTCCGT GGACGATCAG TGGATCTACT gg; cggatgc CTACCTGGAC 2820
TGCATTGAGC GCATCACGTT CAGCGGCCAG CAGCGCTCCG TCATCCTGGA CAGACTCCCG 2880
CACCCCTATG CCATTGCTGT CTTTAAGAAT GAGATTTACT GGGATGACTG GTCACAGCTC 2940
AGCATATTCC GAGCTTCTAA GTACAGCGGG TCCCAGATGG AGATTCTGGC CAGCCAGCTC 3000
ACGGGGCTGA TGGACATGAA GATCTTCTAC AAGGGGAAGA ACACAGGAAG CAATGCGTGT 3060
GTACCCAGGC CGTGCAGCCT GCTGTGCCTG CCCAGAGCCA ACAACAGCAA AAGCTGCAGG 3120
TGTCCAGATG GCGTGGCCAG CAGTGTCCTC CCTTCCGGGG ACCTGATGTG TGACTGCCCT 3180
AAGGGCTACG AGCTGAAGAA CAACACGTGT GTCAAAGAAG AAGACACCTG TCTGCGCAAC 3240
CAGTACCGCT GCAGCAACGG GAACTGCATC AACAGCATCT GGTGGTGCGA TTTCGACAAC 3300
GACTGCGGAG ACATGAGCGA CGAGAAGAAC TGCCCTACCA CCATCTGCGA CCTGGACACC 3360
CAGTTCCGTT GCCAGGAGTC TGGGACGTGC ATCCCGCTCT CCTACAAATG TGACCTCGAG 3420
GATGACTGTG GGGACAACAG TGACGAAAGG CACTGTGAAA TGCACCAGTG CCGGAGCGAC 3480
GAATACAACT GCAGCTCGGG CATGTGCATC CGCTCCTCCT GGGTGTGCGA Z GGGGA C AAC 3540
GACTGCAGGG ACTGGTCCGA CGAGGCCAAC TGCACAGCCA TCTATCACAC CTGTGAGGCC 3600
TCCAACTTCC AGTGCCGCAA CGGGCACTGC ATCCCCCAGC GGTGGGCGTG TGACGGCGAC 3660
GCCGACTGCC AGGATGGCTC TGATGAGGAT CCAGCCAACT GTGAGAAGAA GTGCAACGGC 3720
TTCCGCTGCC CGAACGGCAC CTGCATTCCC TCCACCAAGC ACTGTGACGG CCTGCACGAT 3780
TGCTCGGACG GCTCCGACGA GCAGCACTGC GAGCCCCTGT GTACACGGTT CATGGACTTC 3840
GTGTGTAAGA ACCGCCAGCA GTGCCTCTTC CACTCCATGG TGTGCGATGG GATCATCCAG 3900
TGCCGTGACG GCTCCGACGA GGACCCAGCC TTTGCAGGAT GCTCCCGAGA CCCCGAGTTC 3960
CACAAGGTGT GCGATGAGTT CGGCTTCCAG TGTCAGAACG GCGTGTGCAT CAGCTTGATC 4020
TGGAAGTGCG ACGGGATGGA TGACTGCGGG GACTACTCCG ACGAGGCCAA CTGTGAAAAC 4080
CCCACAGAAG CCCCCAACTG CTCCCGCTAC TTCCAGTTCC GGTGTGACAA TGGCCACTGC 4140
ATCCCCAACA GGTGGAAGTG TGACAGGGAG AATGACTGTG GGGACTGGTC CGACGAGAAG 4200
GACTGTGGAG ATTCACATGT ACTTCCGTCT ACGACTCCTG CACCCTCCAC GTGTCTGCCC 4260
AATTACTACC GCTGCGGCGG GGGGGCCTGC GTGATAGACA CGTGGGTTTG TGACGGGTAC 4320
CGAGATTGCG CAGATGGATC CGACGAGGAA GCCTGCCCCT CGCTCCCCAA TGTCACTGCC 4380
ACCTCCTCCC CCTCCCAGCC TGGACGATGC GACCGATTTG AGTTTGAGTG CCACCAGCCA 4440
AAGAAGTGCA TCCCTAACTG GAGACGCTGT GACGGCCATC AGGATTGCCA GGATGGCCAG 4500
GACGAGGCCA ACTGCCCCAC TCACAGCACC TTGACCTGCA TGAGCTGGGA GTTCAAGTGT 4560
GAGGATGGCG AGGCCTGCAT CGTGCTGTCA GAACGCTGCG ACGGCTTCCT GGACTGCTCA 4620
GATGAGAGCG ACGAGAAGGC CTGCAGTGAT GAGTTAACTG TATACAAAGT ACAGAATCTT 4680
CAGTGGACAG CTGACTTCTC TGGGAATGTC XCTTTGACCT GGATGCGGCC CAAAAAAATG 4740
CCCTCTGCTG CTTGTGTATA CAACGTGTAC TATAGAGTTG TTGGAGAGAG O .TATGGAAG 4800
ACTCTGGAGA CTCACAGCAA TAAGACAAAC ACTGTATTAA AAGTGTTGAA ACCAGATACC 4860
ACCTACCAGG TTAAAGTGCA GGTTCAGTGC CTGAGCAAGG TGCACAACAC CAATGACTTT 4920
GTGACCTTGA GAACTCCAGA GGGATTGCCA GACGCCCCTC AGAACCTCCA GCTGTCGCTC 4980
CACGGGGAAG AGGAAGGTGT GATTGTGGGC CACTGGAGCC CTCCCACCCA CACCCACGGC 5040
CTCATTCGCG AATACATTGT AGAGTATAGC AGGAGTGGTT CCAAGGTGTG GACTTCAGAA 5100
AGGGCTGCTA GTAACTTTAC AGAAATAAAG AACTTGTTGG TCAACACCCT GTACACCGTC 5160
AGAGTGGCTG CGGTGACGAG TCGTGGGATA GGAAACTGGA GCGATTCCAA ATCCATTACC 5220
ACCGTGAAAG GAAAAGCGAT CCCGCCACCA AATATCCACA TTGACAACTA CGATGAAAAT 5280
TCCCTGAGTT TTACCCTGAC CGTGGATGGG AACATCAAGG TGAATGGCTA TGTGGTGAAC 5340
CTTTTCTGGG CATTTGACAC CCACAAACAA GAGAAGAAAA CCATGAACTT CCAAGGGAGC 5400
TCAGTGTCCC ACAAAGTTGG CAATCTGACA GCACAGACGG CCTATGAGAT TTCCGCCTGG 5460
GCCAAGACTG ACTTGGGCGA TAGTCCTCTG TCATTTGAGC ATGTCACGAC CAGAGGGGTT 5520
CGCCCACCTG CTCCTAGCCT CAAGGCCAGG GCTATCAATC AGACTGCAGT GGAATGCACC 5580
TGGACAGGCC CCAGGAATGT GGTGTATGGC ATTTTCTATG CCACATCCTT CCTGGACCTC 5640
TACCGCAACC CAAGCAGCCT GACCACGCCG CTGCACAACG CAACCGTGCT CGTCGGTAAG 5700
GATGAGCAGT ATCTGTTTCT GGTCCGGGTG GTGATGCCCT ACCAAGGGCC GTCCTCGGAC 5760
TACGTGGTCG TGAAGATGAT CCGGGACAGC AGGCTTCCTC CCCGGCACCT GCATGCCGTT 5820
CACACCGGCA AGACCTCGGC CGTCATCAAG TGGGAGTCGC CCTACGACTC TCCTGACCAG 5880
GACCTGTTCT ATGCGATCGC AGTTAAAGAT CTGATACGAA AGACGGACCG GAGCTACAAA 5940
GTCAAGTCCC GCAACAGCAC CGTGGAGTAC ACCCTGAGCA AGCTGGAGCC CGGAGGGAAA 6000
TACCACGTCA TTGTGCAGCT GGGGAACATG AGCAAAGATG CCAGTGTGAA GATCACCACC 6060
GTTTCGTTAT CGGCACCCGA TGCCTTAAAA ATCATAACAG AAAATGACCA CGTCCTTCTC 6120
TTCTGGAAAA GTCTAGCTCT AAAGGAAAAG TATTTTAACG AAAGCAGGGG CTACGAGATA 6180
CACATGTTTG ATAGCGCCAT GAATATCACC GCATACCTTG GGAATACTAC TGACAATTTC 6240
TTTAAAATTT CCAACCTGAA GATGGGTCAC AATTACACAT TCACGGTCCA GGCACGATGC 6300
CTTTTGGGCA GCCAGATCTG CGGGGAGCCT GCCGTGCTAC TGTATGATGA GCTGGGGTCT 6360
GGTGGCGATG CGTCGGCGAT GCAGGCTGCC AGGTCTACTG ATGTCGCCGC CGTGGTGGTG 6420
CCCATCCTGT TTCTGATACT GCTGAGCCTG GGGGTCGGGT TTGCCATCCT GTACACGAAG 6480
CATCGGAGGC TGCAGAGCAG CTTCACCGCC TTCGCCAACA GCCACTACAG CTCCAGACTC 6540
GGCTCCGCCA TCTTCTCCTC TGGGGATGAC TTGGGGGAGG ATGATGAAGA TGCTCCTATG 6600
ATCACTGGAT TTTCGGACGA CGTCCCCATG GTGATAGCC 6639

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2213 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe
1 5 10 15

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr
20 25 30

Arg Thr Leu His Gly Gly Arg Ala Pro Leu Pro Gln Glu Arg Gly Phe
35 40 45

Arg Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Glu Arg Gly
50 55 60

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Arg
65 70 75 80

Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val
85 90 95

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu
100 105 110

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala
115 120 125

Arg Pro Arg Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser
130 135 140

Phe Asn Lys Ile Ser Glu Lys Leu Asn Phe Gly Ala Gly Asn Asn Thr
145 150 155 160

Glu Ala Val Val Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg
165 170 175

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp
180 185 190

Phe Cys Asn Thr Ile His Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp
195 200 205

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg
210 215 220

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr
225 230 235 240
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Trp He Met He Gin Glu His Val oys Ser Phe Ser Tip Glv He *ftp 
245 250 " 255 

Pro Tyr Asp Lys Pro Asn Thr He Tyr He Glu Arg His Glu Pro Se:* 
260 265 270 

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gin Ser Arg Giu 

275 280 285 

Asn Gin Glu Val He Leu Glu Glu Val Arg Asp Phe Gin Leu Arg Asp 
290 295 300 

Lvs Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Pro Leu 

205 310 315 320 

Gin Ser Ser Val Gin Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg 

325 330 335 

Ala Ala Gin Phe Val Thr Arg His Pro He Asn Glu Tyr Tyr He Ala 

345 350 



340 



Asp Ala Ser Glu Asp Gin Val Phe Val Cys val Ser His Ser Asn Asn 

355 360 365 

Arq Thr Asn Leu Tyr He Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu 

370 375 380 

Ser Leu Glu Asn Val Leu Tyr Tyr Thr Pro Gly Gly Ala Gly Ser Asp 

385 390 395 400 

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg 
405 410 415 

Val Glu Gly Leu Gin Gly Val Tyr He Ala Thr Leu He Asn Gly Ser 
420 425 430 

Met Asn Glu Glu Asn Met Arg Ser Val He Thr Phe Asp Lys Gly Gly 
435 440 445 

Th- ~rp Glu Phe Leu Gin Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys 
450 455 460 

He Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gin Arg 
465 470 475 4dU 

Leu Ser Gin Leu Leu Asn Leu Gin Leu Arg Arg Met Pro He Leu Ser 
485 490 

Lys Glu Ser Ala Pro Gly Leu He He Ala Thr Gly Ser Val Gly Lys 
500 505 510 

Asn Leu Ala Ser Lys Thr Asn Val Tyr He Ser Ser Ser Ala Gly Ala 

515 520 525 

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp 

530 535 
His Gly Gly He He Met Ala He Ala Gin Gly Met Glu Thr Asn Glu 
545 550 555 

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe 
565 570 575

Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu
580 585 590

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His
595 600 605

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro
610 615 620

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly
625 630 635 640

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro
645 650 655

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val
660 665 670

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe
675 680 685

Arg Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly
690 695 700

Phe Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr
705 710 715 720

Tyr Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser
725 730 735

Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro
740 745 750

Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile
755 760 765

His Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr
770 775 780

Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys
785 790 795 800

Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu
805 810 815

Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr
820 825 830

Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val
835 840 845

Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe
850 855 860

Arg Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu
865 870 875 880

Val Leu Val Pro Gln Glu Gly Ile Met Phe Trp Thr Asp Trp Gly Asp
885 890 895

Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr
900 905 910

Arg Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp
915 920 925

Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Asp Cys Ile Glu Arg
930 935 940

Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Arg Leu Pro
945 950 955 960

His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp
965 970 975

Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln
980 985 990

Met Glu Ile Leu Ala Ser Gln Leu Thr Gly Leu Met Asp Met Lys Ile
995 1000 1005

Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro
1010 1015 1020

Cys Ser Leu Leu Cys Leu Pro Arg Ala Asn Asn Ser Lys Ser Cys Arg
1025 1030 1035 1040

Cys Pro Asp Gly Val Ala Ser Ser Val Leu Pro Ser Gly Asp Leu Met
1045 1050 1055

Cys Asp Cys Pro Lys Gly Tyr Glu Leu Lys Asn Asn Thr Cys Val Lys
1060 1065 1070

Glu Glu Asp Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn
1075 1080 1085

Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp
1090 1095 1100

Met Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr
1105 1110 1115 1120

Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys
1125 1130 1135

Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys
1140 1145 1150

Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met
1155 1160 1165

Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp
1170 1175 1180

Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala
1185 1190 1195 1200

Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala
1205 1210 1215

Cys Asp Gly Asp Ala Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Ala
1220 1225 1230

Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys
1235 1240 1245

Ile Pro Ser Thr Lys His Cys Asp Gly Leu His Asp Cys Ser Asp Gly
1250 1255 1260

Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe
1265 1270 1275 1280

Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp
1285 1290 1295

Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala
1300 1305 1310

Gly Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly
1315 1320 1325

Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp
1330 1335 1340

Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn
1345 1350 1355 1360

Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp
1365 1370 1375

Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp
1380 1385 1390

Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu
1395 1400 1405

Pro Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg
1410 1415 1420

Cys Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr
1425 1430 1435 1440

Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro
1445 1450 1455

Asn Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg
1460 1465 1470

Phe Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg
1475 1480 1485

Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn
1490 1495 1500

Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys
1505 1510 1515 1520

Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe
1525 1530 1535

Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu
1540 1545 1550

Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly
1555 1560 1565

Asn Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala
1570 1575 1580

Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys
1585 1590 1595 1600

Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu
1605 1610 1615

Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser
1620 1625 1630

Lys Val His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly
1635 1640 1645

Leu Pro Asp Ala Pro Gln Asn Leu Gln Leu Ser Leu His Gly Glu Glu
1650 1655 1660

Glu Gly Val Ile Val Gly His Trp Ser Pro Pro Thr His Thr His Gly
1665 1670 1675 1680

Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Val
1685 1690 1695

Trp Thr Ser Glu Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu
1700 1705 1710

Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg
1715 1720 1725

Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Val Lys Gly
1730 1735 1740

Lys Ala Ile Pro Pro Pro Asn Ile His Ile Asp Asn Tyr Asp Glu Asn
1745 1750 1755 1760

Ser Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly
1765 1770 1775

Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Lys
1780 1785 1790

Lys Thr Met Asn Phe Gln Gly Ser Ser Val Ser His Lys Val Gly Asn
1795 1800 1805

Leu Thr Ala Gln Thr Ala Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp
1810 1815 1820

Leu Gly Asp Ser Pro Leu Ser Phe Glu His Val Thr Thr Arg Gly Val
1825 1830 1835 1840

Arg Pro Pro Ala Pro Ser Leu Lys Ala Arg Ala Ile Asn Gln Thr Ala
1845 1850 1855

Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe
1860 1865 1870

Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Ser Ser Leu Thr
1875 1880 1885

Thr Pro Leu His Asn Ala Thr Val Leu Val Gly Lys Asp Glu Gln Tyr
1890 1895 1900

Leu Phe Leu Val Arg Val Val Met Pro Tyr Gln Gly Pro Ser Ser Asp
1905 1910 1915 1920

Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His
1925 1930 1935

Leu His Ala Val His Thr Gly Lys Thr Ser Ala Val Ile Lys Trp Glu
1940 1945 1950

Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Phe Tyr Ala Ile Ala Val
1955 1960 1965

Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg
1970 1975 1980

Asn Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys
1985 1990 1995 2000

Tyr His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val
2005 2010 2015

Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile
2020 2025 2030

Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys
2035 2040 2045

Glu Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp
2050 2055 2060

Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe
2065 2070 2075 2080

Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val
2085 2090 2095

Gln Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val
2100 2105 2110

Leu Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln
2115 2120 2125

Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe
2130 2135 2140

Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys
2145 2150 2155 2160

His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr
2165 2170 2175

Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly
2180 2185 2190

Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val
2195 2200 2205

Pro Met Val Ile Ala
2210

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6961 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(ix) FEATURE:

(A) NAME/KEY: sig peptide

(B) LOCATION: 178..261

(ix) FEATURE:

(A) NAME/KEY: mat peptide

(B) LOCATION: 262..6816

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

mriCCTGAC GGCGCCGCGC CGCGCCGCGC CGCGCCGAGC GGGACCCAGC 6 0 

gill slE ssssss sasss sssss ssssss ss 

!K S SS S S SS S SS S S SS SS 5 2 IS S ' 221 
SS S3 2°. 2! SS S SS SS S SS 55 SS S5 IS S SS 27 

s si SS sss ss ss sis ™ SS ss SS 5 s is ss 32 



S3 S3 SS S SS S S S5 SS SS 25 IS SS S S5 SI 3 " 

- ss ss ss ss ss ss ss ss sss ss ss ss ss ss ss ss 420 

70 ™ r nn* rar. r,Tr AGC 4 68 



ss ss sss ss ss ss sts ss tis ss ss ss ss as ss ss 

SS S 25 IS SS SS SS SIS SIS 53 SS IS SSS SS SS £ 

SS i SIS vIS sss ss sis ss ss - SS SSS SS sss ss 



ss ss ss ss ss si? ss - s Tyt sii^s??^ is is 612 

i£ SS ss IS ss S ss its IS SS SSS SS "I "S S SS «" 

£ S3 51? SS HI IS SS SS SS IS 252 SS SS Si SS SS ™ 

16 5 ?:7° ~™ ttt r,ar ttc 7 56 



SS is ss si SSS SS sss SS SI SS IS SS fff is ss is 
SS SSS SS SS SI sss is is SS sss SS SS SS SS SI SS 

SS SS SS ss ss sss is SS SI SS ss sss us ss ss iff 

CAC CCC AAC AAG CAG CTG TGG AAG TCG GAT GAT TTT GGC CAG ACC TGG
His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr Trp

SIS S3 SS SS i SS S3 SS IS IS IS IS SS SS SS SSS 



245 250
TAT GAC AAA CCA AAC ACC ATC TAC ATC GAA CGG CAC GAA [...] TCT GGC 996
Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser Gly
260 265 270








TAC TCC ACG GTT TTC CGA AGT ACA GAC TTC TTC CAG TCC CGG GAA AAC 1044
Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu Asn
275 280 285










CAG GAA GTG ATC TTG GAG GAA GTG AGA GAC TTT CAG CTT CGG GAC AAG 1092
Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp Lys
290 295 300 305




TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG GGC AGT CCA CTG CAG 1140
Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Pro Leu Gln
310 315 320






TCT TCT GTC CAG CTC TGG GTC TCC TTT GGC CGG AAG CCC ATG CGG GCC 1188
Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg Ala
325 330 335






GCC CAG TTT GTT ACA AGA CAT CCT ATC AAC GAA TAT TAC ATC GCG GAT 1236
Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala Asp
340 345 350








G' TCG GAG GAC CAG GTG TTT GTG TGT GTC AGT CAC AGC AAC AAC CGC 1284
Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn Arg
355 360 365




ACC AAC CTC TAC ATC TCG GAG GCA GAG GGC TTG AAG TTC TCT CTG TCC 1332
Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu Ser
370 375 380 385




CTG GAG AAC GTG CTC TAC TAC ACC CCG GGA GGG GCC GGC AGT GAC ACC 1380
Leu Glu Asn Val Leu Tyr Tyr Thr Pro Gly Gly Ala Gly Ser Asp Thr
390 395 400




TTG GTG AGG TAC [...] GCA AAT GAA CCG TTT GCT GAC TTC CAT CGT GTG 1428
Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg Val
405 410 415




GAA GGC TTG CAG GGA GTC TAC ATT GCT ACT CTG ATT AAT GGT TCT ATG 1476
Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser Met
420 425 430




AAT GAG GAG AAC ATG AGA TCT GTC ATC ACC TTT GAC AAA GGG GGC ACC 1524
Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly Thr
435 440 445












TGG GAA [...] CTG CAG GCT CCA GCC TTC ACG GGG TAT GGA GAG AAA ATC 1572
Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys Ile
450 455 460 465




AAC TGT GAG CTG TCC GAG GGC TGT TCC CTC CAC CTG GCC CAG CGC CTC 1620
Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gln Arg Leu
470 475 480




AGC [...] TTG CTC AAC CTC CAG CTC CGG AGG ATG CCC ATC CTG TCC AAG 1668
Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser Lys
485 490 495




GAG TCC GCG CCT GGC CTC ATC ATT GCC ACG GGC TCA GTG GGA AAG AAC 1716
Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys Asn
500 505 510




[...] GCT AGC AAG ACA AAC GTG TAC ATC TCT AGC AGT GCT GGA GCC AGG 1764
Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala Arg
515 520 525












TGG CGA GAG GCA CTT CCT GGA CCT CAC TAC TAT ACA TGG GGA GAC CAT 1812
Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp His
530 535 540 545




GGC GGC ATC ATC ATG GCC ATT GCC CAA GGC ATG GAA ACC AAC GAA CTG 1860
Gly Gly Ile Ile Met Ala Ile Ala Gln Gly Met Glu Thr Asn Glu Leu
550 555 560






AAG TAC AGT ACC AAC GAA GGG GAG ACC TGG AAA GCC TTC ACC TTC TCT 1908
Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe Ser
565 570 575








GAG AAG CCC GTG TTT GTG TAT GGG CTC CTC ACG GAA CCC GGC GAG AAG 1956
Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu Lys
580 585 590










AGC ACG GTC TTC ACC ATC TTT GGC TCC AAC AAG GAG AAC GTG CAC AGC 2004
Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His Ser
595 600 605

TGG CTC ATC CTC CAG GTC AAT GCC ACA GAC GCC CTG GGG GTT CCT TGC 2052
Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro Cys
610 615 620 625

ACA GAG AAC GAC TAC AAG CTC TGG TCA CCA TCT GAT GAG CGG GGG AAT 2100
Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly Asn
630 635 640

GAG TGT TTG CTT GGA CAC AAG ACT GTT TTC AAA CGG AGG ACC CCG CAC 2148
Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro His
645 650 655

GCC ACA TGC TTT AAC GGA GAA GAC TTT GAC AGG CCG GTG GTT GTG TCC 2196
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser
660 665 670

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAG TGT GAC TTT GGC TTC CGG 2244
Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe Arg
675 680 685

ATG AGT GAA GAC TTG GCA TTA GAG GTG TGT GTT CCA GAT CCA GGA TTT 2292
Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly Phe
690 695 700 705

TCT GGA AAG TCC TCC CCT CCA GTG CCT TGT CCC GTG GGC TCT ACG TAC 2340
Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr Tyr
710 715 720

AGG CGA TCA AGA GGC TAC CGG AAG ATT TCT GGG GAC ACC TGT AGT GGA 2388
Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser Gly
725 730 735

GGA GAT GTT GAG GCA CGG CTA GAA GGA GAG CTG GTC CCC TGT CCC CTG 2436
Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro Leu
740 745 750

GCA GAA GAG AAC GAG TTC ATC CTG TAC GCC ACG CGC AAG TCC ATC CAC 2484
Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile His
755 760 765

CGC TAT GAC CTG GCT TCC GGA ACC ACG GAG CAG TTG CCC CTC ACT GGG 2532
Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr Gly
770 775 780 785

TTG CGG GCA GCA GTG GCC CTG GAC TTT GAC TAT GAG CAC AAC TGC CTG 2580
Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys Leu

TAT TGG TCT GAC CTG GCC TTG GAC GTC ATC CAG CGC CTC TGT TTG AAC 2628
Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu Asn
805 810 815

GGG AGT ACA GGA CAA GAG GTG ATC ATC AAC TCT GAC CTG GAG ACG GTA 2676
Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr Val
820 825 830

GAA GCT TTG GCT TTT GAA CCC CTC AGC CAA TTA CTT TAC TGG GTG GAC 2724
Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val Asp

GCA GGC TTT AAA AAG ATC GAG GTA GCC AAT CCA GAT GGT GAC TTC CGA 2772
Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe Arg

CTC ACC GTC GTC AAT TCC TCG GTG CTG GAT CGG CCC CGG GCC CTG GTC 2820
Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu Val
870 875 880

CTT GTG CCC CAA GAA GGG ATC ATG TTC TGG ACC GAC TGG GGA GAC CTG 2868
Leu Val Pro Gln Glu Gly Ile Met Phe Trp Thr Asp Trp Gly Asp Leu
885 890 895

AAG CCT GGG ATT TAT CGG AGC AAC ATG GAC GGA TCT GCC GCC TAT CGC 2916
Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr Arg

CTC GTG TCG GAG GAT GTG AAG TGG CCC AAT GGC ATT TCC GTG GAC GAT 2964
Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp Asp

CAG TGG ATC TAC TGG ACG 11? GCC TAC CTG GAC TGC ATT GAG CGC ATC 3012
Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Asp Cys Ile Glu Arg Ile
930 935 940 945




ACG TTC AGC GGC CAG CAG CGC TCC GTC ATC k TG [...] AGA CTC CCG CAC 3060
Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Arg Leu Pro His
950 955 960






CCC TAT GCC ATT GCT GTC TTT AAG AAT GAO ATT TAC TGG GAT GAC TGG 3108
Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp Trp
965 970 975




TCA CAG CTC AGC ATA TTC CGA GCT TCT AAG TAC AGC GGG TCC CAG ATG 3156
Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln Met
980 985 990




GAG ATT CTG GCC AGC CAG CTC ACG GGG CTG ATG GAC ATG AAG ATC TTC 3204
Glu Ile Leu Ala Ser Gln Leu Thr Gly Leu Met Asp Met Lys Ile Phe
995 1000 1005










TAC AAG GGG AAG AAC ACA GGA AGC AAT GCG TGT GTA CCC AGG CCG TGC 3252
Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro Cys
1010 1015 1020 1025




AGC CTG CTG TGC CTG CCC AGA GCC AAC AAC AGC AAA AGC TGC AGG TGT 3300
Ser Leu Leu Cys Leu Pro Arg Ala Asn Asn Ser Lys Ser Cys Arg Cys
1030 1035 1040




CCA GAT GGC GTG GCC AGC AGT GTC CTC CCT TCC GGG GAC CTG ATG TGT 3348
Pro Asp Gly Val Ala Ser Ser Val Leu Pro Ser Gly Asp Leu Met Cys
1045 1050 1055






GAC TGC [...] AAG GGC TAC GAG CTG AAG AAC AAC ACG TGT GTC AAA GAA 3396
Asp Cys Pro Lys Gly Tyr Glu Leu Lys Asn Asn Thr Cys Val Lys Glu
1060 1065 1070








GAA GAC ACC TGT CTG CGC AAC CAG TAC CGC TGC AGC AAC GGG AAC TGC 3444
Glu Asp Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn Cys
1075 1080 1085










ATC AAC AGC ATC TGG TGG TGC GAT TTC GAC AAC GAC TGC GGA GAC ATG 3492
Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp Met
1090 1095 1100 1105




AGC GAC GAG AAG AAC TGC CCT ACC ACC ATC TGC GAC CTG GAC ACC CAG 3540
Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr Gln
1110 1115 1120




TTC CGT TGC CAG GAG TCT GGG ACG TGC ATC CCG CTC TCC TAC AAA TGT 3588
Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys Cys
1125 1130 1135






GAC CT^"" GAG GAT GAC TGT GGG GAC AAC AGT GAC GAA AGG CAC TGT GAA 3636
Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys Glu
1140 1145 1150








ATG CAC CAG TGC CGG AGC GAC GAA TAC AAC TGC AGC TCG GGC ATG TGC 3684
Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met Cys
1155 1160 1165










ATC CGC TCC TCC TGG GTG TGC GAC GGG GAC AAC GAC TGC AGG GAC TGG 3732
Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp Trp
1170 1175 1180 1185






[...] GAC GAG GCC AAC TGC ACA GCC ATC TAT CAC ACC TGT GAG GCC TCC 3780
Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala Ser
1190 1195 1200




AAC TTC CAG TGC CGC AAC GGG CAC TGC ATC CCC CAG CGG TGG GCG TGT 3828
Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala Cys
1205 1210 1215






GAC GGC GAC GCC GAC TGC CAG GAT GGC TCT GAT GAG GAT CCA GCC AAC 3876
Asp Gly Asp Ala Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Ala Asn
1220 1225 1230








TGT GAG AAG AAG TGC AAC GGC TTC CGC TGC CCG AAC GGC ACC TGC ATT 3924
Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys Ile
1235 1240 1245












[...] TCC ACC AAG CAC TGT GAC GGC CTG CAC GAT TGC TCG GAC GGC TCC 3972
Pro Ser Thr Lys His Cys Asp Gly Leu His Asp Cys Ser Asp Gly Ser
1250 1255 1260 1265




GAC GAG CAG CAC TGC GAG CCC CTG TGT ACA CGG TTC ATG GAC TTC GTG 4020
Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe Val
1270 1275 1280

TGT AAG AAC CGC CAG CAG TGC CTC TTC CAC TCC ATG GTG TGC GAT GGG 4068
Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp Gly
1285 1290 1295

ATC ATC CAG TGC CGT GAC GGC TCC GAC GAG GAC CCA GCC TTT GCA GGA 4116
Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala Gly
1300 1305 1310

TGC TCC CGA GAC CCC GAG TTC CAC AAG GTG TGC GAT GAG TTC GGC TTC 4164
Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly Phe
1315 1320 1325

CAG TGT CAG AAC GGC GTG TGC ATC AGC TTG ATC TGG AAG TGC GAC GGG 4212
Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp Gly
1330 1335 1340 1345

ATG GAT GAC TGC GGG GAC TAC TCC GAC GAG GCC AAC TGT GAA AAC CCC 4260
Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn Pro
1350 1355 1360

ACA GAA GCC CCC AAC TGC TCC CGC TAC TTC CAG TTC CGG TGT GAC AAT 4308
Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp Asn
1365 1370 1375

GGC CAC TGC ATC CCC AAC AGG TGG AAG TGT GAC AGG GAG AAT GAC TGT 4356
Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp Cys
1380 1385 1390

GGG GAC TGG TCC GAC GAG AAG GAC TGT GGA GAT TCA CAT GTA CTT CCG 4404
Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu Pro
1395 1400 1405

TCT ACG ACT CCT GCA CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC TGC 4452
Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg Cys
1410 1415 1420 1425

GGC GGG GGG GCC TGC GTG ATA GAC ACG TGG GTT TGT GAC GGG TAC CGA 4500
Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr Arg
1430 1435 1440

GAT TGC GCA GAT GGA TCC GAC GAG GAA GCC TGC CCC TCG CTC CCC AAT 4548
Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro Asn
1445 1450 1455

GTC ACT GCC ACC TCC TCC CCC TCC CAG CCT GGA CGA TGC GAC CGA TTT 4596
Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg Phe
1460 1465 1470

GAG TTT GAG TGC CAC CAG CCA AAG AAG TGC ATC CCT AAC TGG AGA CGC 4644
Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg Arg
1475 1480 1485

TGT GAC GGC CAT CAG GAT TGC CAG GAT GGC CAG GAC GAG GCC AAC TGC 4692
Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn Cys
1490 1495 1500 1505

CCC ACT CAC AGC ACC TTG ACC TGC ATG AGC TGG GAG TTC AAG TGT GAG 4740
Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys Glu
1510 1515 1520

GAT GGC GAG GCC TGC ATC GTG CTG TCA GAA CGC TGC GAC GGC TTC CTG 4788
Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe Leu
1525 1530 1535

GAC TGC TCA GAT GAG AGC GAC GAG AAG GCC TGC AGT GAT GAG TTA ACT 4836
Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu Thr
1540 1545 1550

GTA TAC AAA GTA CAG AAT CTT CAG TGG ACA GCT GAC TTC TCT GGG AAT 4884
Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly Asn
1555 1560 1565

GTC ACT TTG ACC TGG ATG CGG CCC AAA AAA ATG CCC TCT GCT GCT TGT 4932
Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala Cys
1570 1575 1580 1585

GTA TAC AAC GTG TAC TAT AGA GTT GTT GGA GAG AGC ATA TGG AAG ACT 4980
Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys Thr
1590 1595 1600

CTG GAG ACT CAC AGC AAT AAG ACA AAC ACT GTA TTA AAA GTG TTG AAA 5028
Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu Lys
1605 1610 1615

CCA GAT ACC ACC TAC CAG GTT AAA GTG CAG GTT CAG TGC CTG AGC AAG 5076
Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser Lys
1620 1625 1630






GTG CAC AAC ACC AAT GAC TTT GTG [...] TTG AGA ACT CCA [...] GGA TTG 5124
Val His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly Leu
1635 1640 1645








CCA GAC GCC CCT CAG AAC CTC CAG CTG TCG CTC CAC GGG GAA GAG GAA 5172
Pro Asp Ala Pro Gln Asn Leu Gln Leu Ser Leu His Gly Glu Glu Glu
1650 1655 1660 1665




GGT GTG ATT GTG GGC CAC TGG AGC CCT CCC ACC CAC ACC CAC GGC CTC 5220
Gly Val Ile Val Gly His Trp Ser Pro Pro Thr His Thr His Gly Leu
1670 1675 1680




ATT CGC GAA TAC ATT GTA GAG TAT AGC AGG AGT GGT TCC AAG GTG TGG 5268
Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Val Trp
1685 1690 1695






ACT TCA GAA AGG GCT GCT AGT AAC TTT ACA GAA ATA AAG AAC TTG TTG 5316
Thr Ser Glu Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu Leu
1700 1705 1710








GTC AAC ACC CTG TAC ACC GTC AGA GTG GCT GCG GTG ACG AGT CGT GGG 5364
Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg Gly
1715 1720 1725










ATA GGA AAC TGG AGC GAT TCC AAA TCC ATT ACC ACC GTG AAA GGA AAA
Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Val Lys Gly Lys
1730 1735 1740 1745




GCG ATC CCG CCA CCA AAT ATC CAC ATT GAC AAC TAC GAT GAA AAT TCC
Ala Ile Pro Pro Pro Asn Ile His Ile Asp Asn Tyr Asp Glu Asn Ser
1750 1755 1760




CTG AGT TTT ACC CTG ACC GTG GAT GGG AAC ATC AAG GTG AAT GGC TAT 5508
Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly Tyr
1765 1770 1775






GTG 


GTG 


AAC 


CTT 


TTC 


TGG 


GCA 


TTT 


GAC 


ACC 


CAC 


AAA 


CAA 


GAG 


AAG 


AAA 


^556 


Val 


Val 


Asn 


Leu 


Phe 


Trp 


Ala 


Phe 


Asp 


Thr 


His 


Lys 


Gin 


Glu 


Lys 


Lys 








1780 








1785 








1790 








ACC 


ATG 


AAC 


TTC 


CAA 


GGG 


AGC 


TCA 


GTG 


TCC 


CAC 


AAA 


GTT 


GGC 


AAT 


CTG 


56 04 


Thr 


Me: 


Asn 


Phe 


Gin 


Gly 


Ser 


Ser 


Val 


Ser 


His 


Lys 


Val 


Gly 


Asn 


Leu 






1795 








1800 








1805 










ACA 


GCA 


CAG 


ACG 


GCC 


TAT 


GAG 


ATT 


TCC 


GCC 


TGG 


GCC 


AAG 


ACT 


GAC 


TTG 


5 6 5 2 


Thr 


Ala 


Gin 


Thr 


Ala 


Tyr 


Glu 


He 


Ser 


Ala 


Trp 


Ala 


Lys 


Thr 


Asp 


Leu 




1810 








1815 








1320 








1825 






3AT 


AGT 


CCT 


CTG 


TCA 


TTT 


GAG 


CAT 


GTC 


ACG 


ACC 


AGA 


GGG 


GTT 


CGC 


5 7 C 0 


sly 


Asp 


Ser 


Pro 


Leu 


Ser 


Phe 


Glu 


His 


Val 


Thr 


Thr 


Arg 


Gly Val 


Arg 












1330 








1835 








1840 




CCA 




GCT 


CCT 


AGC 


CTC 


AAG 


GCC 


AGG 


GCT 


ATC 


AAT 


CAG 


ACT 


GCA 


GTG 


574 8 


Pro 


Pro 


Ala 


Fro 


Ser 


Leu 


Lys 


Ala 


Arg 


Ala 


He 


Asn 


Gin 


Thr 


Ala 


Val 










1845 






1850 








1855 






GAA 


TGC 


ACC 


TGG 


ACA 


GGC 


CCC 


AGG 


AAT 


GTG 


GTG 


TAT 


GGC 


ATT 




TAT 


5796 


Glu 


Cys 


Thr 


Trp 


Thr 


Gly 


Pro 


Arg 


Asn 


Val 


Val 


Tyr 


Gly 


He 


Phe 


Tyr 








I860 








1865 








1870 








GCC 


ACA 


TCC 


TTC 


CTG 


GAC 


CTC 


TAC 


CGC 


AAC 


CCA 


AGC 


AGC 


CTG 


ACC 


ACG 


5 84 4 


Ala 


Thr 


Ser 


Fhe 


Leu 


Asp 


Leu 


Tyr 


Arg 


Asn 


Pro 


Ser 


Ser 


Leu 


Thr 


Thr 






1875 








1880 








1885 










CCG 


CTG 


CAC 


AAC 


GCA 


ACC 


GTG 


CTC 


GTC 


GGT 


AAG 


GAT 


GAG 


CAG 


TAT 


CTG 


5 8 92 


Pro 


Leu 


His 


Asn 


Ala 


Thr 


Val 


Leu 


Val 


Gly 


Lys 


Asp 


Glu 


Gin 


Tyr 


Leu 




1890 








1895 








1900 








1905 




TTT 


CTG 


GTC 


CGG 


GTG 


GTG 


ATG 


CCC 


TAC 


CAA 


GGG 


CCG 


TCC 


TCG 


GAC 


TAC 


5 94 0 


?he 


Leu 


Val 


Arg 


Val 


Val 


Met 


Pro 


Tyr 


Gin 


Gly 


Pro 


Ser 


Ser 


Asp 


Tyr 










1910 








1915 








1920 




GTG 


GTC 


GTG 


AAG 


ATG 


ATC 


CCG 


GAC 


AGC 


AGG 


CTT 


CCT 


CCC 


CGG 


CAC 


CTG 


5988 


Val 


Val 


Val 


Lys 


Met 


lie 


Pro 


Asp 


Ser 


Arg 


Leu 


Pro 


Pro 


Arg 


HIS 


Leu 










1925 








1930 








1935 






CAT 


GCC 


GTT 


CAC 


ACC 


GGC 


AAG 


ACC 


TCG 


GCC 


GTC 


ATC 


AAG 


TGG 


GAG 


TCG 


6036 


His 


Ala 


Val 


His 


Thr 


Gly 


Lys 


Thr 


Ser 


Ala 


Val 


He 


Lys 


Trp 


Glu 


Ser 








1940 






1945 








1950 








CCC 


TAC 


GAC 


TCT 


CCT 


GAC 


CAG 


GAC 


CTG 


TTC 


TAT 


GCG 


ATC 


GCA 


GTT 


AAA 


6084 


Pro 


Tyr 


Asp 


Ser 


Pro 


Asp 


Gin 


Asp 


Leu 


Phe 


Tyr 


Ala 


He 


Ala 


val 


Lys 





84 
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19S5 1960 I- 6 * 

GAT CTG ATA CGA AAG ACG GAC CGG AGC TAC AAA GTC AAG TCC CGC AAC 6132
Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg Asn
1970 1975 1980 1985

AGC ACC GTG GAG TAC ACC CTG AGC AAG CTG GAG CCC GGA GGG AAA TAC 6180
Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys Tyr
1990 1995 2000

CAC GTC ATT GTG CAG CTG GGG AAC ATG AGC AAA GAT GCC AGT GTG AAG 6228
His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val Lys
2005 2010 2015

ATC ACC ACC GTT TCG TTA TCG GGA CCC GAT GCC TTA AAA ATC ATA ACA 6276
Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile Thr
2020 2025 2030

GAA AAT GAC CAC GTC CTT CTC TTC TGG AAA AGT CTA GCT CTA AAG GAA 6324
Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys Glu
2035 2040 2045

AAG TAT TTT AAC GAA AGC AGG GGC TAC GAG ATA CAC ATG TTT GAT AGC 6372
Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp Ser
2050 2055 2060 2065

GCC ATG AAT ATC ACC GCA TAC CTT GGG AAT ACT ACT GAC AAT TTC TTT 6420
Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe Phe
2070 2075 2080

AAA ATT TCC AAC CTG AAG ATG GGT CAC AAT TAC ACA TTC ACG GTC CAG 6468
Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val Gln
2085 2090 2095

GCA CGA TGC CTT TTG GGC AGC CAG ATC TGC GGG GAG CCT GCC GTG CTA 6516
Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val Leu
2100 2105 2110

CTG TAT GAT GAG CTG GGG TCT GGT GGC GAT GCG TCG GCG ATG CAG GCT 6564
Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln Ala
2115 2120 2125

GCC AGG TCT ACT GAT GTC GCC GCC GTG GTG GTG CCC ATC CTG TTT CTG 6612
Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe Leu
2130 2135 2140 2145

ATA CTG CTG AGC CTG GGG GTC GGG TTT GCC ATC CTG TAC ACG AAG CAT 6660
Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys His
2150 2155 2160

CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC AGC 6708
Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr Ser
2165 2170 2175

TCC AGA CTC GGC TCC GCC ATC TTC TCC TCT GGG GAT GAC TTG GGG GAG 6756
Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly Glu
2180 2185 2190

GAT GAT GAA GAT GCT CCT ATG ATC ACT GGA TTT TCG GAC GAC GTC CCC 6804
Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val Pro
2195 2200 2205

ATG GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6856
Met Val Ile Ala

TTTTATTTGA TAAAGATAGT TGATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6916
GTTATTTTTA TATGGGCCAA AAACAAAAGC AAAAAAAAAA AAAAA 6961

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATATCCACAT TGACAGCTAT GGTGAAAATT ATCTAAGCTT CACCCTGACC ATGGAGAGTG 60

ATATCAAGGT GAATGGCTAT GTGGTGAACC TTTTCTGGGC ATTTGACACC CACAAGCAAG 120

AGAGGAGAAC TTTGAACTTC CGAGGAAGCA TATTGTCACA CAAAGTTGGC AATCTGACAG 180

CTCATACATC CTATGAGATT TCTGCCTGGG CCAAGACTGA CTTGGGGGAT AGCCCTCTGG 240

CATTTGAGCA TGTTATGACC AGAGGGGTTC GCCCACCTGC ACCTAGCCTC AAGGCCAAAG 300

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6642 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCGT TCCTATTCAC CCTGGTCGCA 60
CTGCTGCCGC CCGGAGCTCT CTGCGAAGTC TGGACGCAGA GGCTGCACGG CGGCAGCGCG 120
CCCTTGCCCC AGGACCGGGG CTTCCTCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180
TGGGCGCGCG GGGATGCCAG GGGGGCGAGC CGCGCGGACG AGAAGCCGCT CCGGAGGAAA 240
CGGAGCGCTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTTAG TCTGAATGAT 300
TCCCACAATC AGATGGTGGT GCACTGGGCT GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360
GCCCGAGATA GCCTGGCATT GGCGAGGCCC AAGAGCAGTG ATGTGTACGT GTCTTACGAC 420
TATGGAAAAT CATTCAAGAA AATTTCAGAC AAGTTAAACT TTGGCTTGGG AAATAGGAGT 480
GAAGCTGTTA TTGCCCAGTT CTACCACAGC CCTGCGGACA ACAAGCGGTA CATCTTTGCA 540
GACGCTTATG CCCAGTACCT CTGGATCACG TTTGACTTCT GCAACACTCT TCAAGGCTTT 600
TCCATCCCAT TTCGGGCAGC TGATCTCCTC CTACACAGTA AGGCCTCCAA CCTTCTCTTG 660
GGCTTTGACA GGTCCCACCC CAACAAGCAG CTGTGGAAGT CAGATGACTT TGGCCAGACC 720
TGGATCATGA TTCAGGAACA TGTCAAGTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780
CCAAATACCA TCTACATTGA ACGACACGAA CCCTCTGGCT ACTCCACTGT CTTCCGAAGT 840
ACAGATTTCT TCCAGTCCCG GGAAAACCAG GAAGTGATCC TTGAGGAAGT GAGAGATTTT 900
CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC ATCTCTTGGG CAGTGAACAG 960
CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGAGAGC AGCCCAGTTT 1020
GTCACAAGAC ATCCTATTAA TGAATATTAC ATCGCAGATG CCTCCGAGGA CCAGGTGTTT 1080
GTGTGTGTCA GCCACAGTAA CAACCGCACC AATTTATACA TCTCAGAGGC AGAGGGGCTG 1140
AAGTTCTCCC TGTCCTTGGA GAACGTGCTC TATTACAGCC CAGGAGGGGC CGGCAGTGAC 1200
ACCTTGGTGA GGTATTTTGC AAATGAACCA TTTGCTGACT TCCACCGAGT GGAAGGATTG 1260
CAAGGAGTCT ACATTGCTAC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCG 1320
GTCATCACCT TTGACAAAGG GGGAACCTGG GAGTTTCTTC AGGCTCCAGC CTTCACGGGA 1380
TATGGAGAGA AAATCAATTG TGAGCTTTCC CAGGGCTGTT CCCTTCATCT GGCTCAGCGC 1440
CTCAGTCAGC TCCTCAACCT CCAGCTCCGG AGAATGCCCA TCCTGTCCAA GGAGTCGGCT 1500
CCAGGCCTCA TCATCGCCAC TGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560
TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAC 1620
ACATGGGGAG ACCACGGCGG AATCATCACG GCCATTGCCC AGGGCATGGA AACCAACGAG 1680
CTAAAATACA GTACCAATGA AGGGGAGACC TGGAAAACAT TCATCTTCTC TGAGAAGCCA 1740
GTGTTTGTGT ATGGCCTCCT CACAGAACCT GGGGAGAAGA GCACTGTCTT CACCATCTTT 1800
GGCTCGAACA AAGAGAATGT CCACAGCTGG CTGATCCTCC AGGTCAATGC CACGGATGCC 1860
TTGGGAGTTC CCTGCACAGA GAATGACTAC AAGCTGTGGT CACCATCTGA TGAGCGGGGG 1920
AATGAGTGTT TGCTGGGACA CAAGACTGTT TTCAAACGGC GGACCCCCCA TGCCACATGC 1980
TTCAATGGAG AGGACTTTGA CAGGCCGGTG GTCGTGTCCA ACTGCTCCTG CACCCGGGAG 2040
GACTATGAGT GTGACTTCGG TTTCAAGATG AGTGAAGATT TGTCATTAGA GGTTTGTGTT 2100
CCAGATCCGG AATTTTCTGG AAAGTCATAC TCCCCTCCTG TGCCTTGCCC TGTGGGTTCT 2160
ACTTACAGGA GAACGAGAGG CTACCGGAAG ATTTCTGGGG ACACTTGTAG CGGAGGAGAT 2220
GTTGAAGCGC GACTGGAAGG AGAGCTGGTC CCCTGTCCCC TGGCAGAAGA GAACGAGTTC 2280
ATTCTGTATG CTGTGAGGAA ATCCATCTAC CGCTATGACC TGGCCTCGGG AGCCACCGAG 2340
CAGTTGCCTC TCACCGGGCT ACGGGCAGCA GTGGCCCTGG ACTTTGACTA TGAGCACAAC 2400
TGTTTGTATT GGTCCGACCT GGCCTTGGAC GTCATCCAGC GCCTCTGTTT GAATGGAAGC 2460
ACAGGGCAAG AGGTGATCAT CAATTCTGGC CTGGAGACAG TAGAAGCTTT GGCTTTTGAA 2520
CCCCTCAGCC AGCTGCTTTA CTGGGTAGAT GCAGGCTTCA AAAAGATTGA GGTAGCTAAT 2580
CCAGATGGCG ACTTCCGACT CACAATCGTC AATTCCTCTG TGCTTGATCG TCCCAGGGCT 2640
CTGGTCCTCG TGCCCCAAGA GGGGGTGATG TTCTGGACAG ACTGGGGAGA CCTGAAGCCT 2700
GGGATTTATC GGAGCAATAT GGATGGTTCT GCTGCCTATC ACCTGGTGTC TGAGGATGTG 2760
AAGTGGCCCA ATGGCATCTC TGTGGACGAC CAGTGGATTT ACTGGACGGA TGCCTACCTG 2820
GAGTGCATAG AGCGGATCAC GTTCAGTGGC CAGCAGCGCT CTGTCATTCT GGACAACCTC 2880
CCGCACCCCT ATGCCATTGC TGTCTTTAAG AATGAAATCT ACTGGGATGA CTGGTCACAG 2940
CTCAGCATAT TCCGAGCTTC CAAATACAGT GGGTCCCAGA TGGAGATTCT GGCAAACCAG 3000
CTCACGGGGC TCATGGACAT GAAGATTTTC TACAAfiGGGA AnAACACTGG AAGCAATGCC 3060
TGTGTGCCCA GGCCATGCAG CCTGCTGTGC CTGCCCAAGG CCAACAACAG TAGAAGCTGC 3120
AGGTGTCCAG AGGATGTGTC CAGCAGTGTG CTTCCATCAG GGGACCTGAT GTGTGACTGC 3180
CCTCAGGGCT ATCAGCTCAA GAACAATACC TGTGTCAAAG AAGAGAACAC CTGTCTTCGC 3240
AACCAGTATC GCTGCAGCAA CGGGAACTGT ATCAACAGCA TTTGGTGGTG TGACTTTGAC 3300
AACGACTGTG GAGACATGAG CGATGAGAGA AACTGCCCTA CCACCATCTG TGACCTGGAC 3360
ACCCAGTTTC GTTGCCAGGA GTCTGGGACT TGTATCCCAC TGTCCTATAA ATGTGACCTT 3420
GAGGATGACT GTGGAGACAA CAGTGATGAA AGTCATTGTG AAATGCACCA GTGCCGGAGT 3480
GACGAGTAGA ACTGCAGTTC CGGCATGTGC ATCCGCTCCT CCTGGGTATG TGACGGGGAC 3540
AACGACTGCA GGGACTGGTC TGATGAAGCC AACTGTACCG CCATCTATCA CACCTGTGAG 3600
GCCTCCAACT TCCAGTGCCG AAACGGGCAC TGCATCCCCC AGCGGTGGGC GTGTGACGGG 3660
GATACGGACT GCCAGGATGG TTCCGATGAG GATCCAGTCA ACTGTGAGAA GAAGTGCAAT 3720
GGATTCCGCT GCCCAAACGG CACTTGCATC CCATCCAGCA AACATTGTGA TGGTCTGCGT 3780
GATTGCTCTG ATGGCTCCGA TGAACAGCAC TGCGAGCCCC TCTGTACGCA CTTCATGGAC 3840
TTTGTGTGTA AGAACCGCCA GCAGTGCCTG TTCCACTCCA TGGTCTGTGA CGGAATCATC 3900
CAGTGCCGCG ACGGGTCCGA TGAGGATGCG GCGTTTGCAG GATGCTCCCA AGATCCTGAG 3960
TTCCACAAGG TATGTGATGA GTTCGGTTTC CAGTGTCAGA ATGGAGTGTG CATCAGTTTG 4020
ATTTGGAAGT GCGACGGGAT GGATGATTGC GGCGATTATT CTGATGAAGC CAACTGCGAA 4080
AACCCCACAG AAGCCCCAAA CTGCTCCCGC TACTTCCAGT TTCGGTGTGA GAATGGCCAC 4140
TGCATCCCCA ACAGATGGAA ATGTGACAGG GAGAACGACT GTGGGGACTG GTCTGATGAG 4200
AAGGATTGTG GAGATTCACA TATTCTTCCC TTCTCGACTC CTGGGCCCTC CACGTGTCTG 4260
CCCAATTACT ACCGCTGCAG CAGTGGGACC TGCGTGATGG ACACCTGGGT GTGCGACGGG 4320
TACCGAGATT GTGCAGATGG CTCTGACGAG GAAGCCTGCC CCTTGCTTGC AAACGTCACT 4380
GCTGCCTCCA CTCCCACCCA ACTTGGGCGA TGTGACCGAT TTGAGTTCGA ATGCCACCAA 4440
CCGAAGACGT GTATTCCCAA CTGGAAGCGC TGTGACGGCC ACCAAGATTG CCAGGATGGC 4500
CGGGACGAGG CCAATTGCCC CACACACAGC ACCTTGACTT GCATGAGCAG GGAGTTCCAG 4560
TGCGAGGACG GGGAGGCCTG CATTGTGCTC TCGGAGCGCT GCGACGGCTT CCTGGACTGC 4620
TCGGACGAGA GCGATGAAAA GGCCTGCAGT GATGAGTTGA CTGTGTACAA AGTACAGAAT 4680
CTTCAGTGGA CAGCTGACTT CTCTGGGGAT GTGACTTTGA CCTGGATGAG GCCCAAAAAA 4740
ATGCCCTCTG CATCTTGTGT ATATAATGTC TACTACAGGG TGGTTGGAGA GAGCATATGG 4800
AAGACTCTGG AGACCCACAG CAATAAGACA AACACTGTAT TAAAAGTCTT GAAACCAGAT 4860
ACCACGTATC AGGTTAAAGT ACAGGTTCAG TGTCTCAGCA AGGCACACAA CACCAATGAC 4920
TTTGTGACCC TGAGGACCCC AGAGGGATTG CCAGATGCCC CTCGAAATCT CCAGCTGTCA 4980
CTCCCCAGGG AAGCAGAAGG TGTGATTGTA GGCCAL. 1 GGG Li L LI L C CAT CCACACCCAT 5040
GGCCTCATCC GTGAGTACAT TGTAGAATAC AGCAGGAG rG bl I LLAAGAT GTGGGCCTCC 5100
CAGAGGGCTG CTAGTAACTT TACAGAAATC AAGAACTTAT TGGTCAACAC TCTATACACC 5160
GTCAGAGTGG CTGCGGTGAC TAGTCGTGGA ATAGGAAACT GGAGCGATTC TAAATCCATT 5220
ACCACCATAA AAGGAAAAGT GATCCCACCA CCAGATATCC ACATTGACAG CTATGGTGAA 5280
AATTATCTAA GCTTCACCCT GACCATGGAG AGTGATATCA AGGTGAATGG CTATGTGGTG 5340
AACCTTTTCT GGGCATTTGA CACCCACAAG CAAGAGAGGA GAACTTTGAA CTTCCGAGGA 5400
AGCATATTGT CACACAAAGT TGGCAATCTG ACAGCTCATA CATCCTATGA GATTTCTGCC 5460
TGGGCCAAGA CTGACTTGGG GGATAGCCCT CTGGCATTTG AGCATGTTAT GACCAGAGGG 5520
GTTCGCCCAC CTGCACCTAG CCTCAAGGCC AAAGCCATCA ACCAGACTGC AGTGGAATGT 5580
ACCTGGACCG GCCCCCGGAA TGTGGTTTAT GGTATTTTCT ATGCCACGTC CTTTCTTGAC 5640
CTCTATCGCA ACCCGAAGAG CTTGACTACT TCACTCCACA ACAAGACGGT CA 1 i G I CAG 1 5700
AAGGATGAGC AGTATTTGTT TCTGGTCCGT GTAGTGGTAC CCTACCAGGG GCCATCCTCT 5760
GACTACGTTG TAGTGAAGAT GATCCCGGAC AGCAGGCTTC CACCCCGTCA LL I uLA 1 G i G 5820
GTTCATACGG GCAAAACCTC CGTGGTCATC AAGTGGGAAT CACCGTATGA rTrrrrT^ a ^ 5880
CAGGACTTGT TGTATGCAAT TGCAGTCAAA GATCTCATAA GAAAGACTGA LAuuAuL 1 5940
AAAGTAAAAT CCCGTAACAG CACTGTGGAA TACACCCTTA ACAAG i l uGA 6000
AAATACCACA TCATTGTCCA ACTGGGGAAC ATGAGCAAAG ATTCCAGCAT AAAAAi i ALU 6060
ACAGTTTCAT TATCAGCACC TGATGCCTTA AAAATCATAA 6120
CTGTTTTGGA AAAGCCTGGC TTTAAAGGAA AAGCATTTTA ATGAAAGCAG 6180
ATACACATGT TTGATAGTGC CATGAATATC ACAGCTTACC I 1 GGGAA 1 AG 6240
TTCTTTAAAA TTTCCAACCT GAAGATGGGT CATAATTACA CGTTCACCGT L LAAu LAAun 6300
TGCCTTTTTG GCAACCAGAT CTGTGGGGAG CCTGCCATCC I 1 G 1 ALUA 6360
TCTGGTGCAG ATGCATCTGC AACGCAGGCT GCCAGATCTA CGGATGTTGC TGCTGTGGTG 6420
GTGCCCATCT TATTCCTGAT ACTGCTGAGC CTGGGGGTGG GGTTTGCCAT CCTGTACACG 6480
AAGCACCGGA GGCTGCAGAG CAGCTTCACC GCCTTCGCCA ACAGCCACTA CAGCTCCAGG 6540
CTGGGGTCCG CAATCTTCTC CTCTGGGGAT GACCTGGGGG AAGATGATGA AGATGCCCCT 6600
ATGATAACTG GATTTTCAGA TGACGTCCCC ATGGTGATAG CC 6642



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2214 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe
1 5 10 15

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr
20 25 30

Gln Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gln Asp Arg Gly Phe
35 40 45

Leu Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly
50 55 60

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys
65 70 75 80

Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val
85 90 95

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu
100 105 110

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala
115 120 125

Arg Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser
130 135 140

Phe Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser
145 150 155 160

Glu Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg
165 170 175

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp
180 185 190

Phe Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp
195 200 205

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg
210 215 220

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr
225 230 235 240

Trp Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp
245 250 255

Pro Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser
260 265 270

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu
275 280 285

Asn Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp
290 295 300

Lys Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln
305 310 315 320

Gln Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg
325 330 335

Ala Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala
340 345 350

Asp Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn
355 360 365

Arg Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu
370 375 380

Ser Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp
385 390 395 400

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg
405 410 415

Val Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser
420 425 430

Met Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly
435 440 445

Thr Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys
450 455 460

Ile Asn Cys Glu Leu Ser Gln Gly Cys Ser Leu His Leu Ala Gln Arg
465 470 475 480

Leu Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser
485 490 495

Lys Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys
500 505 510

Asn Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala
515 520 525

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp
530 535 540

His Gly Gly Ile Ile Thr Ala Ile Ala Gln Gly Met Glu Thr Asn Glu
545 550 555 560

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Thr Phe Ile Phe
565 570 575

Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu
580 585 590

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His
595 600 605

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro
610 615 620

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly
625 630 635 640

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro
645 650 655

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val
660 665 670

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe
675 680 685

Lys Met Ser Glu Asp Leu Ser Leu Glu Val Cys Val Pro Asp Pro Glu
690 695 700

Phe Ser Gly Lys Ser Tyr Ser Pro Pro Val Pro Cys Pro Val Gly Ser
705 710 715 720

Thr Tyr Arg Arg Thr Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys
725 730 735

Ser Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys
740 745 750

Pro Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Val Arg Lys Ser
755 760 765

Ile Tyr Arg Tyr Asp Leu Ala Ser Gly Ala Thr Glu Gln Leu Pro Leu
770 775 780

Thr Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn
785 790 795 800

Cys Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys
805 810 815

Leu Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Gly Leu Glu
820 825 830

Thr Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp
835 840 845

Val Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp
850 855 860

Phe Arg Leu Thr Ile Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala
865 870 875 880

Leu Val Leu Val Pro Gln Glu Gly Val Met Phe Trp Thr Asp Trp Gly
885 890 895

Asp Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala
900 905 910

Tyr His Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val
915 920 925

Asp Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Glu Cys Ile Glu
930 935 940

Arg Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Asn Leu
945 950 955 960

Pro His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp
965 970 975

Asp Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser
980 985 990

Gln Met Glu Ile Leu Ala Asn Gln Leu Thr Gly Leu Met Asp Met Lys
995 1000 1005

Ile Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg
1010 1015 1020

Pro Cys Ser Leu Leu Cys Leu Pro Lys Ala Asn Asn Ser Arg Ser Cys
1025 1030 1035 1040

Arg Cys Pro Glu Asp Val Ser Ser Ser Val Leu Pro Ser Gly Asp Leu
1045 1050 1055

Met Cys Asp Cys Pro Gln Gly Tyr Gln Leu Lys Asn Asn Thr Cys Val
1060 1065 1070

Lys Glu Glu Asn Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly
1075 1080 1085

Asn Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly
1090 1095 1100

Asp Met Ser Asp Glu Arg Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp
1105 1110 1115 1120

Thr Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr
1125 1130 1135

Lys Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Ser His
1140 1145 1150

Cys Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly
1155 1160 1165

Met Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg
1170 1175 1180

Asp Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu
1185 1190 1195 1200

Ala Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp
1205 1210 1215

Ala Cys Asp Gly Asp Thr Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro
1220 1225 1230

Val Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr
1235 1240 1245

Cys Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp
1250 1255 1260

Gly Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr His Phe Met Asp
1265 1270 1275 1280

Phe Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys
1285 1290 1295

Asp Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Ala Ala Phe
1300 1305 1310

Ala Gly Cys Ser Gln Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe
1315 1320 1325

Gly Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys
1330 1335 1340

Asp Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu
1345 1350 1355 1360

Asn Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys
1365 1370 1375

Glu Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn
1380 1385 1390

Asp Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile
1395 1400 1405

Leu Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr
1410 1415 1420

Arg Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly
1425 1430 1435 1440

Tyr Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu
1445 1450 1455

Ala Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp
1460 1465 1470

Arg Phe Glu Phe Glu Cys His Gln Pro Lys Thr Cys Ile Pro Asn Trp
1475 1480 1485

Lys Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Arg Asp Glu Ala
1490 1495 1500

Asn Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Arg Glu Phe Gln
1505 1510 1515 1520

Cys Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly
1525 1530 1535

Phe Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu
1540 1545 1550

Leu Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser
1555 1560 1565

Gly Asp Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala
1570 1575 1580

Ser Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp
1585 1590 1595 1600

Lys Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val
1605 1610 1615

Leu Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu
1620 1625 1630

Ser Lys Ala His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu
1635 1640 1645

Gly Leu Pro Asp Ala Pro Arg Asn Leu Gln Leu Ser Leu Pro Arg Glu
1650 1655 1660

Ala Glu Gly Val Ile Val Gly His Trp Ala Pro Pro Ile His Thr His
1665 1670 1675 1680

Gly Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys
1685 1690 1695

Met Trp Ala Ser Gln Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn
1700 1705 1710

Leu Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser
1715 1720 1725

Arg Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Ile Lys
1730 1735 1740

Gly Lys Val Ile Pro Pro Pro Asp Ile His Ile Asp Ser Tyr Gly Glu
1745 1750 1755 1760

Asn Tyr Leu Ser Phe Thr Leu Thr Met Glu Ser Asp Ile Lys Val Asn
1765 1770 1775

Gly Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu
1780 1785 1790

Arg Arg Thr Leu Asn Phe Arg Gly Ser Ile Leu Ser His Lys Val Gly
1795 1800 1805

Asn Leu Thr Ala His Thr Ser Tyr Glu Ile Ser Ala Trp Ala Lys Thr
1810 1815 1820

Asp Leu Gly Asp Ser Pro Leu Ala Phe Glu His Val Met Thr Arg Gly
1825 1830 1835 1840

Val Arg Pro Pro Ala Pro Ser Leu Lys Ala Lys Ala Ile Asn Gln Thr
1845 1850 1855

Ala Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile
1860 1865 1870

Phe Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Lys Ser Leu
1875 1880 1885

Thr Thr Ser Leu His Asn Lys Thr Val Ile Val Ser Lys Asp Glu Gln
1890 1895 1900

Tyr Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser
1905 1910 1915 1920

Asp Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg
1925 1930 1935

His Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp
1940 1945 1950

Glu Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala
1955 1960 1965

Val Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser
1970 1975 1980

Arg Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly
1985 1990 1995 2000

Lys Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser
2005 2010 2015

Ile Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile
2020 2025 2030

Ile Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu
2035 2040 2045

Lys Glu Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe
2050 2055 2060

Asp Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn
2065 2070 2075 2080

Phe Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr
2085 2090 2095

Val Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala
2100 2105 2110

Ile Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr
2115 2120 2125

Gln Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu
2130 2135 2140

Phe Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr
2145 2150 2155 2160

Lys His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His
2165 2170 2175

Tyr Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu
2180 2185 2190

Gly Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp
2195 2200 2205

Val Pro Met Val Ile Ala
2210
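
The stated lengths of SEQ ID NO: 5 (6642 base pairs) and SEQ ID NO: 6 (2214 amino acids) are consistent with one another, since 6642 / 3 = 2214. The following is a minimal sketch, not part of the original listing, of how that relationship and the first row of the translation might be checked. It assumes the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER`, `translate` and `FIRST_48` are hypothetical and chosen for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the listing uses the standard code, which the patent does not state).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


# First 48 nucleotides of SEQ ID NO: 5 as printed in the listing.
FIRST_48 = "ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCGT TCCTATTC"

if __name__ == "__main__":
    print(" ".join(translate(FIRST_48)))
    print(6642 // 3, 6642 % 3)
```

Translating `FIRST_48` this way reproduces residues 1 to 16 of SEQ ID NO: 6 as listed above; the same function could be applied row by row to cross-check OCR-damaged blocks against the protein sequence.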

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6843 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(ix) FEATURE:

(A) NAME/KEY: sig peptide

(B) LOCATION: 81..164

(C) IDENTIFICATION METHOD: S

(ix) FEATURE:

(A) NAME/KEY: mat peptide

(B) LOCATION: 165..6722

(C) IDENTIFICATION METHOD: S

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CCG GCCCAGCGGC TCTCCTGGCC 23 
TCGCGCTG ZA CATTCTCTCC TGGCGGC GG C GCCACCTGCA GTAGCGTTCG CCCGAACATG 83 
s Met 

1 

GCG ACA CGG AGC AGC AGO AGG GAG TCG CGA CTC CCG TTC CTA TTC ACC 131 
Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr 

5 10 15 

CTG GTC GCA CTG CTG CCG CCC GGA GCT CTC TGC GAA GTC TGG ACG CAG 179 
W Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr Gin 

20 25 30 

AGG CTG CAC GGC GGC AGC GCG CCC TTG CCC CAG GAC CGG GGC TTC CTC 2 27 

Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gin Asp Arg Gly Phe Leu 

35 40 45 

GTG GTG CAG GGC GAC CCG CGC GAG CTG CGG CTG TGG GCG CGC GGG GAT 27 5 

r5 val Val Gin Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly Asp 

50 55 60 65 

GCC AGG GGG GCG AGC CGC GCG GAC GAG AAG CCG CTC CGG AGG AAA CGG 323 
Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys Arg 

70 75 80 

AGC GCT GCC CTG CAG CCC GAG CCC ATC AAG GTG TAC GGA CAG GTT AGT 371
Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val Ser
85 90 95

CTG AAT GAT TCC CAC AAT CAG ATG GTG GTG CAC TGG GCT GGA GAG AAA 419
Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu Lys
100 105 110

AGC AAC GTG ATC GTG GCC TTG GCC CGA GAT AGC CTG GCA TTG GCG AGG 467
Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala Arg
115 120 125

CCC AAG AGC AGT GAT GTG TAC GTG TCT TAC GAC TAT GGA AAA TCA TTC 515
Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser Phe
130 135 140 145

AAG AAA ATT TCA GAC AAG TTA AAC TTT GGC TTG GGA AAT AGG AGT GAA 563
Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser Glu
150 155 160

GCT GTT ATC GCC CAG TTC TAC CAC AGC CCT GCG GAC AAC AAG CGG TAC 611
Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg Tyr
165 170 175

ATC TTT GCA GAC GCT TAT GCC CAG TAC CTC TGG ATC ACG TTT GAC TTC 659
Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp Phe
180 185 190

TGC AAC ACT CTT CAA GGC TTT TCC ATC CCA TTT CGG GCA GCT GAT CTC 707
Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp Leu
195 200 205

CTC CTA CAC AGT AAG GCC TCC AAC CTT CTC TTG GGC TTT GAC AGG TCC 755
Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg Ser
210 215 220 225

CAC CCC AAC AAG CAG CTG TGG AAG TCA GAT GAC TTT GGC CAG ACC TGG 803
His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr Trp
230 235 240

ATC ATG ATT CAG GAA CAT GTC AAG TCC TTT TCT TGG GGA ATT GAT CCC 851
Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp Pro
245 250 255

TAT GAC AAA CCA AAT ACC ATC TAC ATT GAA CGA CAC GAA CCC TCT GGC 899
Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser Gly
260 265 270

TAC TCC ACT GTC TTC CGA AGT ACA GAT TTC TTC CAG TCC CGG GAA AAC 947
Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu Asn
275 280 285

CAG GAA GTG ATC CTT GAG GAA GTG AGA GAT TTT CAG CTT CGG GAC AAG 995
Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp Lys
290 295 300 305
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35 



TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG GGC ACT GAA CAC 1043
Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln Gln
310 315 320

TCT TCT GTC CAG CTC TGG GTC TCC TTT GGC CGG AAG CCC ATG AGA GCA 1091
Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg Ala
325 330 335

GCC CAG TTT GTC ACA AGA CAT CCT ATT AAT GAA TAT TAC ATC GCA GAT 1139
Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala Asp
340 345 350

GCC TCC GAG GAC CAG GTG TTT GTG TGT GTC AGC CAC AGT AAC AAC CGC 1187
Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn Arg
355 360 365

ACC AAT TTA TAC ATC TCA GAG GCA GAG GGG CTG AAG TTC TCC CTG TCC 1235
Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu Ser
370 375 380 385

TTG GAG AAC GTG CTC TAT TAC AGC CCA GGA GGG GCC GGC AGT GAC ACC 1283
Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp Thr
390 395 400

TTG GTG AGG TAT TTT GCA AAT GAA CCA TTT GCT GAC TTC CAC CGA GTG 1331
Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg Val
405 410 415

GAA GGA TTG CAA GGA GTC TAC ATT GCT ACT CTG ATT AAT GGT TCT ATG 1379
Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser Met
420 425 430

AAT GAG GAG AAC ATG AGA TCG GTC ATC ACC TTT GAC AAA GGG GGA ACC 1427
Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly Thr
435 440 445

TGG GAG TTT CTT CAG GCT CCA GCC TTC ACG GGA TAT GGA GAG AAA ATC 1475
Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys Ile
450 455 460 465

AAT TGT GAG CTT TCC CAG GGC TGT TCC CTT CAT CTG GCT CAG CGC CTC 1523
Asn Cys Glu Leu Ser Gln Gly Cys Ser Leu His Leu Ala Gln Arg Leu
470 475 480

AGT GAG CTC CTC AAC CTC CAG CTC CGG AGA ATG CCC ATC CTG TCC AAG 1571
Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser Lys
485 490 495

GAG TCG GCT CCA GGC CTC ATC ATC GCC ACT GGC TCA GTG GGA AAG AAC 1619
Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys Asn
500 505 510

TTG GCT AGC AAG ACA AAC GTG TAC ATC TCT AGC AGT GCT GGA GCC AGG 1667
Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala Arg
515 520 525

TGG CGA GAG GGA CTT CCT GGA CCT CAC TAC TAC ACA TGG GGA GAC CAC 1715
Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp His
530 535 540 545

GGC GGA ATC ATC ACG GCC ATT GCC CAG GGC ATG GAA ACC AAC GAG CTA 1763
Gly Gly Ile Ile Thr Ala Ile Ala Gln Gly Met Glu Thr Asn Glu Leu
550 555 560

AAA TAC AGT ACC AAT GAA GGG GAG ACC TGG AAA ACA TTC ATC TTC TCT 1811
Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Thr Phe Ile Phe Ser
565 570 575

GAG AAG CCA GTG TTT GTG TAT GGC CTC CTC ACA GAA CCT GGG GAG AAG 1859
Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu Lys
580 585 590

AGC ACT GTC TTC ACC ATC TTT GGC TCG AAC AAA GAG AAT GTC CAC AGC 1907
Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His Ser
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CTC 


CAG GTC AAT GCC ACG GAT GCC TTG GGA GTT CCC TGC 1955


Trp 
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He 


Leu 
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Val 


Asn Ala Thr Asp Ala Leu Gly Val Pro Cys
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GAG 


AAT 


GAC 


TAC 


AAG 


CTG 


TGG 


TCA 
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GAT 


GAG 


CGG 


GGG 


AAT 


2003 


Thr 


Glu 


Asn 


Asp 


Tyr 


Lys 
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Trp 


Ser 


Pro 


Ser 


Asp 


Glu 


Arg 


Gly 
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TGT 
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GTT 
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AAA 


CGG 


CGG 
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CCC 


CAT 
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Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Ara Thr Pro His 

645 650 655

CCC ACA TGC TTC AAT GGA GAG GAC TTT GAC AGG CCG GTG GTC GTG TCC 2099
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser
660 665 670

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAG TGT GAC TTC GGT TTC AAG 2147
Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe Lys
675 680 685

ATG AGT GAA GAT TTG TCA TTA GAG GTT TGT GTT CCA GAT CCG GAA TTT 2195
Met Ser Glu Asp Leu Ser Leu Glu Val Cys Val Pro Asp Pro Glu Phe
690 695 700 705

TCT GGA AAG TCA TAC TCC CCT CCT GTG CCT TGC CCT GTG GGT TCT ACT 2243
Ser Gly Lys Ser Tyr Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr
710 715 720

TAC AGG AGA ACG AGA GGC TAC CGG AAG ATT TCT GGG GAC ACT TGT AGC 2291
Tyr Arg Arg Thr Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser
725 730 735

GGA GGA GAT GTT GAA GCG CGA CTG GAA GGA GAG CTG GTC CCC TGT CCC 2339
Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro
740 745 750

CTG GCA GAA GAG AAC GAG TTC ATT CTG TAT GCT GTG AGG AAA TCC ATC 2387
Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Val Arg Lys Ser Ile
755 760 765

TAC CGC TAT GAC CTG GCC TCG GGA GCC ACC GAG CAG TTG CCT CTC ACC 2435
Tyr Arg Tyr Asp Leu Ala Ser Gly Ala Thr Glu Gln Leu Pro Leu Thr
770 775 780 785

GGG CTA CGG GCA GCA GTG GCC CTG GAC TTT GAC TAT GAG CAC AAC TGT 2483
Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys
790 795 800

TTG TAT TGG TCC GAC GCC [illegible] GAC GTC ATC CAG CGC CTC TGT TTG 2531
Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu
805 810 815

AAT GGA AGC ACA GGG CAA GAG GTG ATC ATC AAT TCT GGC CTG GAG ACA 2579
Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Gly Leu Glu Thr
820 825 830

GTA GAA GCT TTG GCT TTT GAA CCC CTC AGC CAG CTG CTT TAC TGG GTA 2627
Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val
835 840 845

GAT GCA GGC TTC AAA AAG ATT GAG GTA GCT AAT CCA GAT GGC GAC TTC 2675
Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe
850 855 860 865

CGA CTC ACA ATC GTC AAT TCC TCT GTG CTT GAT CGT CCC AGG GCT CTG 2723
Arg Leu Thr Ile Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu
870 875 880

GTC CTC GTG CCC CAA GAG GGG GTG ATG TTC TGG ACA GAC TGG GGA GAC 2771
Val Leu Val Pro Gln Glu Gly Val Met Phe Trp Thr Asp Trp Gly Asp
885 890 895

CTG AAG CCT GGG ATT TAT CGG AGC AAT ATG GAT GGT TCT GCT GCC TAT 2819
Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr
900 905 910

CAC CTG GTG TCT GAG GAT GTG AAG TGG CCC AAT GGC ATC TCT GTG GAC 2867
His Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp
915 920 925

GAC CAG TGG ATT TAC TGG ACG GAT GCC TAC CTG GAG TGC ATA GAG CGG 2915
Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Glu Cys Ile Glu Arg
930 935 940 945

ATC ACG TTC AGT GGC CAG CAG CGC TCT GTC ATT CTG GAC AAC CTC CCG 2963
Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Asn Leu Pro
950 955 960

CAC CCC TAT GCC ATT GCT GTC TTT AAG AAT GAA ATC TAC TGG GAT GAC 3011
His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp
965 970 975

TGG TCA CAG CTC AGC ATA TTC CGA GCT TCC AAA TAC AGT GGG TCC CAG 3059
Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln
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TTC 
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AAG 


GGG 


AAG 


AAC 
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GGA 
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3155 
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Pro 
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Pro 
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AAC 


AGT 
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AGC 
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Leu 
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Pro 


Lys 


Ala 
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Asn 


Ser 


Arg 


Ser 


Cys 


Arg 

) 
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1035 
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TGT 


CCA GAG GAT GTG TCC AGC AGT GTG CTT CCA TCA GGG GAC CTG ATG 3251
Cys Pro Glu Asp Val Ser Ser Ser Val Leu Pro Ser Gly Asp Leu Met
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1050 
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TGT 


GAC 
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CCT 
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TAT 


CAG 


CTC 
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TGT 


GTC 


AAA 
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Asn 


Thr 


Cys 

) 


Val 


Lys 








1060 
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CTT 


CGC 


AAC 


CAG 


TAT 
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3347 
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Glu 
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Gly 
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TGG 
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3395 
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He 
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Trp 


Trp 


Cys 
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Asp 


Asn 


Asp 
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Gly 
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1095 








1100 
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ATG 


AGC 


GAT 


GAG 


AGA 
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ATC 


TGT 
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3443 


Met 


Ser 


Asp 


Glu 


Arg 


Asn 
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Pro 


Thr 


Thr 


He 


Cys 


Asp 


Leu 


Asp 


Thr 












1110 








1115 








1120 
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TTT 


CGT 
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TCT 


GGG 
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TGT 


ATC 
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3491 


Gin 


Phe 


Arg 
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Glu 


Ser 


Gly 


Thr 


Cys 


He 


Pro 
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Ser 


Tyr 
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1135 
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AGT 
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3539 
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Glu 
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Asp 


Cys 


Gly 


Asp 


Asn 


Ser 


Asp 


Glu 


Ser 


His 


Cys 








1140 








1145 








1150 
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ATG 


CAC CAG TGC CGG AGT GAC GAG TAC AAC TGC AGT TCC GGC ATG 3587
Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met
1155 1160 1165

TGC ATC CGC TCC TCC TGG GTA TGT GAC GGG GAC AAC GAC TGC AGG GAC 3635
Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp
1170 1175 1180 1185

TGG TCT GAT GAA GCC AAC TGT ACC GCC ATC TAT CAC ACC TGT GAG GCC 3683
Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala
1190 1195 1200

TCC AAC TTC CAG TGC CGA AAC GGG CAC TGC ATC CCC CAG CGG TGG GCG 3731
Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala
1205 1210 1215

TGT GAC GGG GAT ACG GAC TGC CAG GAT GGT TCC GAT GAG GAT CCA GTC 3779
Cys Asp Gly Asp Thr Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Val
1220 1225 1230

AAC TGT GAG AAG AAG TGC AAT GGA TTC CGC TGC CCA AAC GGC ACT TGC 3827
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Lys 
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Asn 


Gly Phe Arg Cys Pro Asn Gly Thr Cys
1235 1240
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TGT 


GAT 


GGT 
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CGT 


GAT 
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TCT 


GAT 


GGC 


3875 


Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp Gly
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1255 
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1265 
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His 
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4019 
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Ala 
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TTC CAG TGT CAG AAT GGA GTG TGC ATC AGT TTG ATT TGG AAQ TGC GAC 4 115 
Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp
1330 1335 1340 1345

GGG ATG GAT GAT TGC GGC GAT TAT TCT GAT GAA GCC AAC TGC GAA AAC 4163
Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn
1350 1355 1360

CCC ACA GAA GCC CCA AAC TGC TCC CGC TAC TTC CAG TTT CGG TGT GAG 4211
Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Glu
1365 1370 1375

AAT GGC CAC TGC ATC CCC AAC AGA TGG AAA TGT GAC AGG GAG AAC GAC 4259
Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp
1380 1385 1390

TGT GGG GAC TGG TCT GAT GAG AAG GAT TGT GGA GAT TCA CAT ATT CTT 4307
Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile Leu
1395 1400 1405

CCC TTC TCG ACT CCT GGG CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC 4355
Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg
1410 1415 1420 1425

TGC AGC AGT GGG ACC TGC GTG ATG GAC ACC TGG GTG TGC GAC GGG TAC 4403
Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly Tyr
1430 1435 1440

CGA GAT TGT GCA GAT GGC TCT GAC GAG GAA GCC TGC CCC TTG CTT GCA 4451
Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu Ala
1445 1450 1455

AAC GTC ACT GCT GCC TCC ACT CCC ACC CAA CTT GGG CGA TGT GAC CGA 4499
Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp Arg
1460 1465 1470

TTT GAG TTC GAA TGC CAC CAA CCG AAG ACG TGT ATT CCC AAC TGG AAG 4547
Phe Glu Phe Glu Cys His Gln Pro Lys Thr Cys Ile Pro Asn Trp Lys
1475 1480 1485

CGC TGT GAC GGC CAC CAA GAT TGC CAG GAT GGC CGG GAC GAG GCC AAT 4595
Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Arg Asp Glu Ala Asn
1490 1495 1500 1505

TGC CCC ACA CAC AGC ACC TTG ACT TGC ATG AGC AGG GAG TTC CAG TGC 4643
Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Arg Glu Phe Gln Cys
1510 1515 1520

GAG GAC GGG GAG GCC TGC ATT GTG CTC TCG GAG CGC TGC GAC GGC TTC 4691
Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe
1525 1530 1535

CTG GAC TGC TCG GAC GAG AGC GAT GAA AAG GCC TGC AGT GAT GAG TTG 4739
Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu
1540 1545 1550

ACT GTG TAC AAA GTA CAG AAT CTT CAG TGG ACA GCT GAC TTC TCT GGG 4787
Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly
1555 1560 1565

GAT GTG ACT TTG ACC TGG ATG AGG CCC AAA AAA ATG CCC TCT GCA TCT 4835
Asp Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ser
1570 1575 1580 1585

TGT GTA TAT AAT GTC TAC TAC AGG GTG GTT GGA GAG AGC ATA TGG AAG 4883
Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys
1590 1595 1600

ACT CTG GAG ACC CAC AGC AAT AAG ACA AAC ACT GTA TTA AAA GTC TTG 4931
Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu
1605 1610 1615

AAA CCA GAT ACC ACG TAT CAG GTT AAA GTA CAG GTT CAG TGT CTC AGC 4979
Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser
1620 1625 1630

AAG GCA CAC AAC ACC AAT GAC TTT GTG ACC CTG AGG ACC CCA GAG GGA 5027
Lys Ala His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly
1635 1640 1645

TTG CCA GAT GCC CCT CGA AAT CTC CAG CTG TCA CTC CCC AGG GAA GCA 5075
Leu Pro Asp Ala Pro Arg Asn Leu Gln Leu Ser Leu Pro Arg Glu Ala
1650 1655 1660 1665

GAA GGT GTG ATT GTA GGC CAC TGG GCT CCT CCC ATC CAC ACC CAT GGC 5123
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Asn 


Lys Thr Val Ile Val Ser Lys Asp Glu Gln Tyr
1890 1895 1900 1905

TTG TTT CTG GTC CGT GTA GTG GTA CCC TAC CAG GGG CCA TCC TCT GAC 5843
Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser Asp
1910 1915 1920

TAC GTT GTA GTG AAG ATG ATC CCG GAC AGC AGG CTT CCA CCC CGT CAC 5891
Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His
1925 1930 1935

CTG CAT GTG GTT CAT ACG GGC AAA ACC TCC GTG GTC ATC AAG TGG GAA 5939
Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp Glu
1940 1945 1950

TCA CCG TAT GAC TCT CCT GAC CAG GAC TTG TTG TAT GCA ATT GCA GTC 5987
Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala Val
1955 1960 1965

AAA GAT CTC ATA AGA AAG ACT GAC AGG AGC TAC AAA GTA AAA TCC CGT 6035
Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg
1970 1975 1980 1985

AAC AGC ACT GTG GAA TAC ACC CTT AAC AAG TTG GAG CCT GGC GGG AAA 6083
Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly Lys
1990 1995 2000

TAC CAC ATC ATT GTC CAA CTG GGG AAC ATG AGC AAA GAT TCC AGC ATA 6131
Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser Ile
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25 



AAA ATT ACC ACA [illegible] TCA TTA TCA GCA GAT GCC TTA AAA [illegible] [illegible] 6179
Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile
2020 2025 2030

ACA GAA AAT GAT CAT GTT CTT CTG TTT TGG AAA AGC CTG GCT TTA AAG 6227
Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys
2035 2040 2045

AAG CAT TTT AAT GAA AGC AGG GGC TAT GAG ATA CAC ATG TTT GAT 6275
[illegible] Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp
2050 2055 2060 2065

AGT GCC ATG AAT ATC ACA GCT TAC CTT GGG AAT ACT ACT GAC AAT TTC 6323
Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe
2070 2075 2080

TTT AAA ATT TCC AAC CTG AAG ATG GGT CAT AAT TAC ACG TTC ACC GTC 6371
Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val
2085 2090 2095

CAA GCA AGA TGC CTT TTT GGC AAC CAG ATC TGT GGG GAG CCT GCC ATC 6419
Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala Ile
2100 2105 2110

CTG CTG TAC GAT GAG CTG GGG TCT GGT GCA GAT GCA TCT GCA ACG CAG 6467
Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr Gln
2115 2120 2125

GCT GCC AGA TCT ACG GAT GTT GCT GCT GTG GTG GTG CCC ATC TTA TTC 6515
Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe
2130 2135 2140 2145

CTG ATA CTG CTG AGC CTG GGG GTG GGG TTT GCC ATC CTG TAC ACG AAG 6563
Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys
2150 2155 2160

CAC CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC 6611
His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr
2165 2170 2175

AGC TCC AGG CTG GGG TCC GCA ATC TTC TCC TCT GGG GAT GAC CTG GGG 6659
Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly
2180 2185 2190



GAA GAT GAT GAA GAT GCC CCT ATG ATA ACT GGA TTT TCA GAT GAC GTC 6707
Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val
2195 2200 2205

CCC ATG GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6762
Pro Met Val Ile Ala
2210

TTTTATTTGA TAAAGATAGT TGATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6822
GTTATTTTTA TATGGGCCAA A 6843



Claims 

1. DNA having a nucleotide sequence as shown by Sequence ID No. 1.

2. An LDL receptor analog protein having an amino acid sequence as shown by Sequence ID No. 2 and coded by DNA of Claim 1.

3. DNA having a nucleotide sequence as shown by Sequence ID No. 5.

4. An LDL receptor analog protein having an amino acid sequence as shown by Sequence ID No. 6 and coded by DNA of Claim 3.

5. A recombinant vector comprising DNA as shown by Sequence ID No. 1 or 5 and a replicable vector.

6. Transformant cells which harbor the recombinant vector of Claim 5. 
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7. A method for the production of an LDL receptor analog protem comprising the steps of curturing the transformants 
of Claim 6 and collecting a polypeptide produced in the culture. 






(19) Europäisches Patentamt
European Patent Office
Office européen des brevets (11) EP 0 773 290 A3

(12) EUROPEAN PATENT APPLICATION

(88) Date of publication A3: 08.07.1998 Bulletin 1998/28

(43) Date of publication A2: 14.05.1997 Bulletin 1997/20

(51) Int. Cl.6: C12N 15/12, C07K 14/705 // C12N15/70, C12N15/79

(21) Application number: 96116108.0

(22) Date of filing: 08.10.1996

(84) Designated Contracting States:
AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

(30) Priority: 09.10.1995 JP 261440/95
24.04.1996 JP 102451/96

(71) Applicant: KOWA COMPANY, LTD.
Naka-ku Nagoya-shi Aichi-ken (JP)

(72) Inventors:
• Saito, Yasushi
Chiba-shi, Chiba (JP)
• Iwasaki, Akio
Tsuchiura-shi, Ibaraki (JP)
• Arai, Koichi
Urawa-shi, Saitama (JP)
• Yamazaki, Hiroyuki
Higashimurayama-shi, Tokyo (JP)

(74) Representative:
Wächtershäuser, Günter, Prof. Dr.
Patentanwalt,
Tal 29
80331 München (DE)

(54) Novel LDL receptor analog protein and the gene coding therefor

(57) The present invention is drawn to the gene of a 
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The invention provides DNA having a nucleotide 
sequence as shown by Sequence ID No. 1 or No 5 is 
disclosed as well as rabbit tissue or human tissue LDL 
receptor analog protein having an amino acid sequence 
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